Closed (fixed)
Project:
Drupal core
Version:
8.6.x-dev
Component:
entity system
Priority:
Major
Category:
Feature request
Assigned:
Unassigned
Issue tags:
Reporter:
Created:
16 Aug 2016 at 14:18 UTC
Updated:
17 Aug 2018 at 11:41 UTC
Jump to comment: Most recent, Most recent file
Comments
Comment #2
timmillwoodOpening this issue to discuss the implementation details for workspaces.
In the Multiversion and Workspace modules we define workspaces as a content entity, then add a workspace field to all entity types to denote which workspace an entity belongs to, then every query has to check the workspace field.
I can't see this passing as a core solution, there are many alternatives, but none that get around adding a condition to each entity query.
Comment #3
timmillwoodComment #4
catchFor CPS there is a similar concept (called 'site versions'). The way it gets around the condition on every query was the following:
1. There is a hard-coded, 'published' site version. When viewing the 'published' site (i.e. not previewing), any query alters are skipped.
2. When previewing the site in a site version, you have to join on the tracking table, can't really avoid that.
3. Additionally, only entities modified in a site version are tracked in that table - it uses a CASE IF WHEN clause in the query alter to either get the default revision for untracked entities or the draft revision for entities in the workspace.
I'm not sure what the storage should look like for core, but conceptually this means the best performance when not previewing (i.e. no change at all), and allows for workspace storage to only track changes, not every entity. (After a site version is published the full history is stored, which does/has caused problems with scaling, but there are issues discussing that in the CPS queue).
Publishing a site version is then taking the 'target' revision for each entity and making it the default revision (then changing status of the site version itself) - this means the full process of publishing is usually a handful of entity saves, which can happen in a single transaction.
Comment #5
fabianx commented+1 to #4.
The query alter only when not in the 'published' workspace, did work very well for us.
Also a table + COALESCE() on the join did also wonders for us.
I think COALESCE is supported also for other DB engines.
Edit:
Uhm and cloning entities per workspace is a no-go for core. Should just use revisions and track those, but always work on 'live' if not overridden in the workspace (e.g. similar to an always rebased branch).
That also worked very well and scalable for us in CPS.
Comment #6
timmillwoodComing back to this (21 days later) I want to make sure I fully understand how this all works.
Then... where do we add the under laying concept of workspace? is adding it in a module stable enough? or should it go into core, much like translations are. Yes, to do translations you need the translation modules, but the entity API can do translations without it.
Comment #7
catch1. Save an initial, default revision. Regardless of the 'published' status of the entity, set it to unpublished. This way it can never show up on the live site.
2. Save a draft revision, with the published status however it was set on the entity, this one is attached to the workspace. Since this will be 'published' in the workspace (if it was set to published on the form), it'll show up.
CPS does this, except the revision saving is tricky because Drupal 7 doesn't allow you to save a revision without updating the base table, but the end result is the same.
This requires published/unpublished support for all entities that can be put into workspaces.
Not sure on the where question. First inclination is to put it in a module though, its not really needed except for whole-site-preview and publishing-sets-of-content-at-once, and sites can be built without either of those.
Comment #8
timmillwoodSo if all content is added as unpublished on all workspaces no matter which workspace the content was added, wouldn't this get confusing to people viewing /admin/content?
Comment #9
fabianx commented#8: A further part that CPS does here is to remove unpublished content (unpublished on the base entity) from listings while outside of a changeset / site version (e.g. in the live changeset).
And because you cannot just un-publish something, but need to go via the workflow, that works very well.
At least that is how I remember this works.
Comment #10
timmillwoodI think for this to work we need to be able to unpublished all of the things. #2810381: Add generic status field to ContentEntityBase is looking to introduce a generic 'status' base field, which we can then add to more entity types.
My current thinking is:
Comment #11
catchFor reference the way CPS does this is the following:
1. Publishing a workspace means 'make the revision associated with the workspace/site version the default one' - which either publishes a draft revision or publishes the entity in general if it's new). Ideally that should happen in a single transaction.
2. The workspace/site version gets 'archived' - so the historical information of what was in it is available. (CPS also records the state of all entities on the site at the time it was published, for historical review and rollback which I don't think we should do in core since it's an implementation nightmare).
Comment #12
timmillwoodToday I committed a 8.x-2.x branch of the Workspace module. I hope this will form the basis for the core experimental module.
So far I have:
Todo:
During this time I also need to work out how replicating content will actually work. The CPS approaches above will play a big part in this. I would also like to use the services we have in the Replication module to determine differences between workspaces. This will allow us to be flexible enough for multi-site replication as well as the core single site preview functionality.
To mark content as belonging to a workspace I'd like to use the Content Moderation approach of a content entity and computed entity reference fields. Thus adding a ContentWorkspace entity which references the content entity and the workspace it belongs to.
For the CPS approach to fully work we need BlockContent, Term, and MenuLinkContent entities (at a minimum) to be unpublishable, #2810381: Add generic status field to ContentEntityBase will go a long way to making this happen.
Comment #13
timmillwoodThe 8.2.x branch of Workspace module now has a ContentWorkspace entity type which gets updated when an entity is created or updated. The workspace field is a computed field which returns the Workspace via ContentWorkspace, or the default Workspace.
Trying to work out the best way of getting entities to only show in their correct workspace.
Comment #14
timmillwoodHere's an initial patch for Workspace module.
It will fail on a couple of things (like hook_help), but I will work on tidy ups today.
What it does do is introduce WorkspaceType and Workspace entities, it then has toolbar integration to switch between workspaces. Every entity that is added gets a ContentWorkspace entity to map it to a workspace, much like in Content Moderation how we map an entity to it's moderation state.
One thing I'd like to get done before commit is only showing entities which belong to the current workspace, but still looking for the best way to do this. @catch? @Fabianx?
Replication is something I think should be added as a follow-up, yes, the module is pretty useless without replication, but to do it properly it has a lot of dependencies.
Comment #16
timmillwoodTrying to fix some of the failures from #14
Comment #18
timmillwoodThis should fix the
\Drupal\system\Tests\Module\InstallUninstallTestissue.After speaking to @amateescu and @dixon there was concerns about getting Workspace module in without replication. Personally I'd like to get it in as a starting block so we can then work on multiple little patches, rather than one big one. Although I can understand the concerns with having a module in core that doesn't really do much.
Comment #19
dawehnerWell, one question I would ask myself here. What is the concrete usecase we solve with this additional experimental module. Can endusers somehow profit from it? Having an experimental module without any enduser features is indeed a big weird. I'm wondering whether we could implement a mini version of replication, or some other feature which workspaces would allow, even if it would be thrown away later.
Comment #20
timmillwood@dawehner - I think a mini version of replication would be cool, I remember we did do this in the early days of the contrib module and it did have some big bugs, but is specific use cases it worked ok. So maybe this could be one approach.
The other approach is just develop it in contrib for now, then push to core as one big patch when ready. To do replication properly (following the couchdb protocol) there are many dependencies.
Comment #21
timmillwoodCurrently blocked on #2831428: Introduce sorted set key value store.
Comment #23
timmillwoodI have been working on the workspace module for core in the 8.x-2.x branch of contrib, I'd happily take reviews there https://www.drupal.org/node/11627/git-instructions/8.x-2.x/nonmaintainer
It's still in pretty early stages, but taking influence from CPS no queries are altered on the "live" workspace, on all other workspace we alter queries, and hook into the node load to only show entities that are live or belong to the workspace. Basic replication of content between workspaces is also in place with a test to back this up.
Comment #24
xjm@timmillwood commented on #2755073: WI: Content Moderation module roadmap that the Workflow team wanted to target this for 8.4.
It's a ways down the roadmap in #2721129: Workflow Initiative and hadn't been discussed with release managers yet. We'd like to see Content Moderation and Workflow stabilize before we add additional workflow features to core. We'd also like to ensure bugs like #2856363: Path alias changes for draft revisions immediately leak into live site and issues with forward revisions are addressed before this is considered for core (they are probably also blockers for Content Moderation being stable anyway).
So, marking postponed for now based on that. Thanks @timmillwood!
Comment #25
xjmFWIW, #1239558: Deleting an entity with revisions and file/image field does not release file usage of non-default revisions, causing files to linger was also a must-have for Phase A and not complete, and this is Phase F. (Also the revision upgrade path is still listed as Phase A, but I'm unclear on the status of that.)
Comment #26
timmillwoodTime for an update, lets see if this works.
Comment #28
timmillwoodLet's give this another go.
Comment #30
timmillwoodok, fixed this test fail in contrib. Will add a new patch here in due course, but until then... please review!
Comment #31
timmillwoodComment #32
timmillwoodComment #34
timmillwood#32 was missing the following files:
Comment #36
timmillwoodSeems like the default workspaces added in workspace.install aren't installed during InstallUninstallTest, which I've no idea why. However, tried to make this less relevant by altering the access checks.
Hard to test locally because InstallUninstallTest runs so slowly.
Comment #38
dawehnerThis module is gigantic, and super hard to make a mind off.
A classical example why having short names doesn't help at all: This could mean includeDocuments or includeDocumentation ... and well in the end its all about entities anyway, so could this be named incldueEntitiess?
Given that I assume you cannot put translations in their own workspaces?
Do we have an issue to have a trait for that?
Couldn't we use 'default' as default workspace parameter always?
Documentation here should IMHO explain what a changeset means. I just assume you'll work on
https://www.drupal.org/core/gates#documentation-block-requirementsover time.I have a hard time undertanding why these things have be to container aware by default
These things sounds like they should be injected using a constructor
IMHO these queries should opt out of access checking.
I'm wondering whether as a first step it would be enough to just return the object, but not store it in the database?
Given these two things do basically the same (once vs. multiple) the names are quite different.
You mean you want to do something else than throwing an exception?
Comment #39
timmillwood@dawehner - Thank you so much for the review!
If it makes things easier, the /core/modules/workspace part of this module is the same as the 8.x-2.x branch of https://www.drupal.org/project/workspace, feel free to review / patch in the issue queue there. I'm sure I could talk @_dixon into giving out commit access to some people too.
Comment #40
timmillwood#38.1 - This is kinda inherited from CouchDB which is what the replication protocol is based on. With couchdb you can pass
include_docs=trueas a parameter to get changes including the documents. Changed to entities here, as it makes more sense. We'll map to include_docs when we come to rewrite Relaxed module in contrib.#38.2 - Updated the fields to not be translatable. We have a similar issue in Content Moderation #2881645: ContentModerationState 'workflow' column empty for all non-default translations.. The entity type is still translatable (even though it has no translatable fields). Will think about if there is a need for this still.
#38.3 - #1979260: Automatically populate the author default value with the current user.
#38.4 - I'm not sure what you mean.
#38.5 - Updated.
#38.6 - This was to get the default workspace param, changed to inject it to the service.
#38.7 - Fixed.
#38.8 - Fixed.
#38.9 - The ReplicationLog is used at the start of replication to know which sequence ID we last replicated up until. Then when the next replication is done we only replicate changes since the last replication.
#38.10 -
getEntityTypesThatCanBelongToWorkspaces()is a bit long isn't it? but maybe we could change the others toisEntityTypeSupported()andisEntitySupported()?#38.11 - (For context, this was inherited from https://github.com/dickolsson/drupal-multiversion/blame/8.x-1.x/src/Work...) Ideally we should throw a 403, but it can be complex from here.
Comment #41
timmillwoodA few tidy ups.
Comment #43
timmillwoodGot hit by the Node save button changes.
Comment #44
timmillwoodAdding missing "administer workspaces" permission and fixing issue with two submit buttons on the deployment form, then updating replication test to use the deployment form rather than the API directly.
Comment #45
yoroy commentedJust reviewed this with the workspace team, Gabor and moi:
The "Add new workspace" buttons is missing in admin. Otherwise, this MVP looks feature complete.
Switching workspaces is jarring: it happens (too) fast and there's little feedback that it actually happened. Wondering if there's enough (visual) feedback about what just happened when you switch workspaces. A first useful step could be to have a clear visual label that distinguishes the live workspace from all the others.
Another idea was the automatically disappearing message "you are now in workspace Foo". See #2773601: Display "You are now in edit mode" prompt when user enters edit mode for a similar treatment for Edit mode. (Showing messages inside the top tray is not useful when the tray closes between workspace switches)
The top tray could be made more compact (in height) in this version where there's no search available yet.
Feel free to add the "needs usability review" tag again when needed :)
Comment #46
timmillwoodAnd we're not using distinct anymore.
This can be removed too
We also saw /admin/content not working, and this is why.
Comment #47
pk188 commentedFixed #46 1 and 2.
Need more information for 3.
Comment #49
timmillwoodMoving back to "Needs work" because #47 didn't fix anything useful.
Comment #50
amateescu commentedHere's an initial architectural review. While going through the code base to get more familiar with it, I also fixed some problems that were standing out on a quick read.
\Drupal\workspace\Plugin\Upstream\Workspaceand it doesn't do aything other than returning the label of a workspace.None of the classes related to them have any kind of useful description:
\Drupal\workspace\UpstreamInterface,\Drupal\workspace\UpstreamManager,\Drupal\workspace\Annotation\Upstream,\Drupal\workspace\Plugin\Upstream\Workspace.Are they related to the replication stuff somehow?
Can we move the work done by
\Drupal\workspace\Replication\DefaultReplicatordirectly into\Drupal\workspace\Form\DeploymentFormsomehow, to simplify the initial patch? We can always put them back in a follow-up issue, even for 8.5.x.\Drupal\workspace\WorkspacePointerInterfaceis not used anywhere, can it be removed?\Drupal\workspace\ParamConverter\EntityRevisionConverter: Content Moderation also provides a similar parameter converter, we need a separate issue to put one in core and remove the one from CM and this one from the patch.view_revision_treespermission was not used anywhere, so I removed it.update any workspace from upstreamis also not used anywhere, but it sounds useful. Do we keep (and use) it or not?Comment #51
timmillwood#50.1 - The upstream is the place to replicate to, because the replication is flexible (it's a tagged service), the upstream needs to be equally flexible. It might not be a local workspace, it might not even be Drupal at all.
In contrib land the Workspace module generates
WorkspacePointercontent entities for each place you can replicate to, and entity reference field on the workspace entity links to the workspace pointer. This method isn't ideal, and plugins make more sense, but as you say it need a lot more documentation, and I guess a test module implementing another plugin would be useful.#50.2 - Yes, I guess we could. The replication has always followed the CouchDB protocol (http://docs.couchdb.org/en/2.0.0/replication/protocol.html), this is a tried and tested protocol for replication so it'd be good if we could follow that as close as possible, but doing it in the form could work initially.
#50.3 - yes, that was left over from the WorkspacePointer entity type, which was removed in favour of the upstream plugin.
#50.4 - yes, that'd be good. In the mean time it can remain here.
#50.5 - ok, that was left over from contrib.
#50.6 - This is a permission to allow users to pull changes down from the upstream. We should use it.
Comment #52
amateescu commented#51.1: Ok, so upstream plugins are related to replication as I suspected. Would it be possible to only replicate to local workspaces in the initial version of the module, and create a followup to introduce the Upstream plugins?
#51.2: Actually, we could do it somewhere in WorkspaceManager perhaps, and have the form call a method there.
#51.3: Ok, removed in this patch.
#51.4: TODO
#51.5: Cool :)
#51.6: TODO
Here's another round of review and fixes, this time focused mainly on the Workspace and ContentWorkspace entity types:
Missing storage schema implementation? This class obviously doesn't exist :)
I can't find any constraint validator in the module, is this supposed to be used or can it be removed?
The ContentModerationState entity type went through a lot of fixes since the last time this issue was touched, we need to bring over all those improvements.
This proves that #2685749: Add a 'machine_name' widget for string field types with a UniqueField constraint is needed in core.
What is this flag used for?
\Drupal\workspace\Replication\DefaultReplicator::replicate()calls this method with an ID composed of anmd5()hash, but here we store that hash in a UUID field. That seems very weird, why not just have a string ID as the main identifier for theReplicationLogentity?I'm guessing that this needs to stay in the initial patch, so we really need to describe what it is and how it's supposed to be used.
Is this really something that needs to be overrideable (i.e. a service in the container)? Can it be a utility class with no interface instead?
Alternatively, can't we use the key/value sorted set store directly instead of this extra layer of indirection?
\Drupal\workspace\KeyValueStore\KeyValueStoreSortedSetInterfacedoes not extend\Drupal\Core\KeyValueStore\KeyValueStoreInterface, so for example there are no methods to delete an item from this key value store, is that intended? If so, why? :)Why do we need this empty field item list class? Can it be removed?
Like the point above, we need to make sure that we bring all the fixes that went into the computed field from Content Moderation back here.
Is there any reason for the 'ws.' prefix?
Blacklist of what? :)
Do we really need two methods for this? The first one seems pretty useless to me..
For future reference, the TODO list so far is: #50.1, #50.2, #50.4 and #50.6.
Comment #53
timmillwood#521, #52.2, #52.3 - I guess it's obvious I just copied the entity from Content Moderation. I have been thinking, maybe we should take another look at uninstalling base fields (which I can't find the issue for),
Comment #54
timmillwoodBeen playing around with uninstalling base fields in #2905944: Uninstall modules which define base fields so we can remove the ContentWorkspace entity and just add a workspace base field.
Comment #55
timmillwood#52.4 - Yes!
#52.5 - To denote it as a local (non-replicating) document, something inherited from CouchDB, maybe we can remove this and have something in contrib http://docs.couchdb.org/en/2.0.0/api/local.html, or maybe we do need it, as ReplicationLog entities belong to workspaces, but don't get replicated to other workspaces.
#52.6 - It could just be a string ID field, it's based on the CouchDB replication ID http://docs.couchdb.org/en/2.0.0/replication/protocol.html#generate-repl...
#52.7 - The initial plan was to move everything from key_value modules directly into core, thus the flexibility was good, but it was decided to put it into Workspace, so maybe we can make it less flexible.
#52.8 - This is inherited from key_value module https://github.com/dickolsson/drupal-key_value/blob/8.x-1.x/src/KeyValue... but I guess it should extend KeyValueSortedSetInterface.
#52.9 - This is an inherited feature from Replication module https://github.com/relaxedws/drupal-replication/tree/8.x-1.x/src/Plugin/...
#52.10 - yes
#52.11 - Just so we're not setting a single integer as a cache context (I guess) it's inherited from Multiversion https://github.com/dickolsson/drupal-multiversion/blob/8.x-1.x/src/Works...
#52.12 - Black list of entity types that don't / can't belong to a workspace. So a workspace can't belong to a workspace.
#52.13 - I guess we can do away with the first one.
Comment #56
timmillwoodDocumenting more details around the replication tagged services and the upstream plugin.
Comment #57
xjmThis is a new feature, also major.
Comment #58
timmillwoodInitial work to remove the ContentWorkspace entity and move to a WorkspaceReference field, this field will be uninstallable after #2282119: Make the Entity Field API handle field purging.
Comment #60
timmillwoodMissed new files from the patch.
Comment #62
timmillwoodHaving some issues trying to fix the failing tests, but thought a progress patch might be useful.
Comment #64
timmillwoodLet's try that again!
Comment #66
timmillwoodAdding some inline comments for hook_query_TAG_alter, hook_views_query_alter, and hook_entity_load implementations, to try and explain what we're trying to do.
Leaving as "Needs work" because only comments have changed, the outstanding issues / test fails still exist.
Comment #67
timmillwoodMainly a progress patch, but removing last mentions of the content_workspace entity, and trying to progress some of the test failures (even if it's just moving them to fail at a later part in the test).
Comment #69
webchickHey, folks! The product management team took a look at this in today's meeting, and posted our initial, unfiltered feedback (which still needs prioritization and/or validation) at #2910673: Workspace UI usability review.
In general, the MVP is solid and all the concepts are fully fleshed out. However, let's make sure we start the dialog early on how to resolve some of these identified UX challenges, so that they don't crop up as blockers at the very end.
We can meet in Vienna to go over this some in-person so we're all on the same page about what needs to be fixed up, and by when (e.g. before commit, before stable, etc.).
Keep up the great work! :D
Comment #70
timmillwood@dixon_ @amateescu and I have been discussing workspaces this morning and the main point which hasn't been clear so far is a specific revision can belong to one or more workspaces. So if we go down the route of a 'workspace' base field, it'll need to be able to handle multiple values.
It's also worth noting that replicating content from one workspace to another will still leave the source workspace usable (it won't be archived) and replication will work slightly differently depending on if you're replicating to the default/live workspace compared to another workspace. When a revision is replicated to live it then becomes the default revision and belongs to all workspaces. When replicating to another workspace the target workspace id will just be added to the workspace field, updating an existing revision and not changing the default revision.
Comment #71
catchI don't see how you could get to this situation via the UI - is this if a change is copied from one workspace to another?
Comment #72
amateescu commentedIn a dev -> stage -> live workspace scenario, this would happen when you pull changes from dev into stage, because the revision will now exist in all three workspaces.
This basically means that we're implementing the Git logic, while CPS would've discarded/archived the dev workspace instead.
Comment #73
timmillwoodIn the Workspace discussions at DrupalCon this morning we discussed to run with the content_workspace linking entity for now. So this patch is a combination of the patch from #56 and the interdiff from #66.
Comment #75
timmillwoodFixing issues with:
- missing index in quickedit
- urls not as string for destination in WorkspaceListBuilder
- Coding standards in workspace.module
Comment #77
timmillwoodFixing mistake in quickedit.module
Comment #78
amateescu commentedI was talking to @timmillwood today about the next steps for this patch and these are a few points that came up:
- we need to drop the custom toolbox stuff from the patch and use the settings tray implementation
- we should consolidate all those separate workspace forms into a single form
- we need to document better all the "why"s in the code, not only the "what"s (for example, we want something like: "why is this code doing X", instead of just "this code is doing X.")
- we need to write proper integration/kernel tests for all the critical parts of the module, which are the three entry points where an entity can be swapped to a different revision: "views query alter", "entity query alter" and "hook entity load"
- we need to make sure that when we display a view of entities, only two of those entry points are reached: "views query alter" and (maybe) "hook entity load", because right now "entity query alter" fires as well
In the meantime, this patch uses
loadRevision()a lot, which means a separate database for each revision load, so I opened #2912562: Add support for loading multiple entity revisions at once to improve the performance in this area.Comment #79
dixon_Here are the general notes that I took during the Friday code sprint in Vienna when meeting together with Andrei and Tim:
Edit: Annotated headings and lists so that it's easier to refer to.
Comment #80
catchThe title reads like a meta issue, but there's a patch here, so re-titling.
Comment #81
andrewmacpherson commentedAre the animations from #32 and Dries' workflow initiative blog update still on the cards?
Assuming they are, it would be great if a user could turn them off. Over at #2316205-72: Provide a way to disable animations for a11y I'm proposing that we use the new
prefers-reduced-motionmedia query coming in CSS level 5, so animations are progressive enhancements which respect OS/browser accessibility preferences.Can the workspaces UI be designed to work without animations?
WCAG 2.1 introduces "Success Criterion 2.2.9 Animations From Interactions." Notably this is at WCAG level AAA so it's NOT a blocker for our a11y gate. It is however a very-very-nice-to-have.
Update: The accessibility plan/policy for supporting prefers-reduced-motion is being coordinated at #2928103: [policy, no patch] Use "prefers-reduced-motion" media query to disable animations now.
Comment #82
amateescu commented@andrewmacpherson, yep, that is still the UI that we want to end up with eventually.
However, for the initial patch that is being targeted for 8.5.0, we will skip the fancy stuff and just provide a minimal UI for switching between workspaces, most likely in the form of a simple block.
Right now we're hard at work on the "backend" part of the module so when we're ready to post patches again for the core review process we will also open all the necessary followups, including the one for the fancy UI, and link them back here.
I now see that we forgot to update this issue in quite a while, so for everyone who wants to follow the active development of the 2.x branch of the Workspace module (the one that we're targeting for inclusion in Drupal 8.5.0 as an experimental module), it's being tracked in this meta issue: #2916089: [meta] Workspace 8.x-2.x
The TL;DR version of that meta issue is that we're adding lots and lots of test coverage and try to simplify the code and also bring it to core quality standards.
Comment #83
amateescu commentedAfter a couple of months of intensive work in the 8.x-2.x branch of the contrib module, we think we're ready to start the core review process again.
The major changes since the last patch that was posted here in #77 are:
If anyone is interested in more details about these changes, they can be found in #2916089: [meta] Workspace 8.x-2.x and all its child issues.
This patch depends on two other core patches, #2929835: [regression] Modules can no longer alter the the table queue of a \Drupal\views\Plugin\views\query\Sql query object and #2885469: Regression: manually setting the ID field for newly-created content entities is not possible anymore (public follow-up to SA-2017-002), so I'm posting a "combined" patch as well to see if we have any unexpected test failures.
Needless to say, the "for review" patch is now ready for architectural reviews! :)
Comment #85
amateescu commentedI think I got all of them, not sure about
InstallUninstallTestbecause that test refuses to finish on my laptop..Comment #86
dixon_I just want to thank the Workflow Initiative team, @amateescu, @timmillwood, @jeqq and everyone else who has peer-reviewed and challenged all of our ideas along the way! Thank you all!
When reviewing this patch that Andrei just posted, it's difficult to immediately see or understand the vast amount of iterations this solution has gone through over the past year or so, starting with the 8.x-1.x branch of the contrib modules almost 2 years ago...
Kudos to everyone! :)
Comment #87
amateescu commentedA few updates today to tackle some UI review points from #2910673: Workspace UI usability review:
Comment #89
amateescu commentedFixed that problem. When the module is being uninstalled and all the workspace entities are deleted, we don't have any active workspace anymore.
Comment #90
borisson_I'm going to post mostly nitpicks and question - sorry if that is not what you're looking for right now.
There's 4 coding standards errors introduced by this patch, all of them are in WorkspaceResourceTestBase, see https://www.drupal.org/pift-ci-job/838505
This is a review of #85. I noticed a couple of
@todos in the code, they don't have followups, are they supposed to be fixed in this issue or at a later point?I don't understand why this is needed, this doesn't look like it should be part of this patch.
setHistory isn't really the correct name here, I think? It doesn't set history, it appends history. So, addHistory? I'm also not sure why we wrap $history in an additional array. That doesn't feel right, maybe inline documentation can help with that though.
We already do an early return, so this else is useless. There was also discussion about disallowing assignment in an if-statement in the coding standards queue recently.
This should do the same, but looks more readable to me, but that's purely preference, feel free to ignore.
This does not conform to the interface, the interface says that this will always return something, but it looks like this doesn't. We should either change the interface or change this to always return a handler.
We can make the docs more specific here, this is always an array that contains one item, the current user id.
Constructs a new EntityAccess instance.?This can be wrapped closer to 80 cols.
I'm not sure if the intermedairy variable
$workspaceis needed. But if it is, the#urlshould also use that variable.should be only one period.
This doesn't apply to the interface I think, it looks like this function can return void.
Comment #91
amateescu commented@borisson_, we want to get as many eyes as we can on this patch, so every bit of review helps! So thank you for starting it! :)
Those 4 coding standards errors are actually false positives, not really sure how that can happen tbh. This is the code in question, and it's perfectly valid according to our CS. The errors are on the 4 lines with the
break;statements.A few of them are already tracked in the issue queue of the contrib module, for example we still have #2927169: Fix views queries open for finishing the views query alter implementation, #2931700: Allow workspaces to be deleted for allowing workspaces to be deleted, etc. So I would say that all the
@todos will be handled in this patch, by working through the open issues from this list: https://www.drupal.org/project/issues/workspace?version=8.x-2.x-dev\Drupal\Tests\rest\Functional\EntityResource\EntityResourceTestBase::testPost()wants to check the creation of duplicate entities (search for// 500 when creating an entity with a duplicate UUID.in that class), entity types which don't have their IDs generated automatically by the storage layer will raise an exception because the value of their ID field isNULL.Live) workspace usually does not have a "parent"/upstream workspace. Fixed the documentation in the interface.Comment #92
borisson_What we do for facets when no facet source is available is something similar, I think. http://cgit.drupalcode.org/facets/tree/src/Entity/Facet.php#n720, we don't save that object but returning at least an object there makes sure the rest of the code doesn't have to do constant null checking.
So I guess returning a default, with default values is a good idea.
Not sure about those coding standards, it looks solid to me as well. Should we open a new issue in the coder queue for this?
Comment #94
amateescu commentedThis fixes #90.10 by making the workspace manager always return an entity object for the active workspace.
As for the bogus coding standards errors, yep, an issue in the code queue sounds good :)
Comment #95
dawehnerThis review was for #90, I hope not too much changed
🔧 What about adding a @see to the interface, which seems to be the central place where people can read more about it.
❓I'm curious whether we can improve the name. Maybe something like content_workspace_assosiation. The name, at least for me, doesn't communicate its uses.
👍Nice!
❓Should these fields actually have some constraints ensuring that these values are valid? Maybe a constraint on the entity level would be needed.
🔧 It would be nice to communicate what this is used for
🔧 The actual method appends entries, so what about naming the function: appendHistory
I'm curious, does it really makes sense to have a static method on an interface, given you'll always call it without an instance?
🙃 I love this typo: "Hey I need some sleep. I just worked a space ton of stuff."
❓This is more than the usual machine name stuff, any particular reason for that? Maybe adding some comment simply clarifies that.
❓I'm curious, could we leverage the plugin collection code out there?
... This owner permission makes things cacheable by user. I don't see this reflected yet. See #2809177: Introduce entity permission providers and related PR on the entity github issue.
❓So do I understand this correctly that this introduces three permissions by workspace, which itself is a content entity, potentially used per editor or even more? It feels like the permission system is not designed to handle that amount of permissions to be honest. You could also add concerns for adding content dependencies in the user.roles.* configurations.
It is a bit counterintuitive that for aggregations we just need to add the joins, but no further expression. Maybe some documentation helps to clarify this.
What about adding a
activeWorkspaceIsDefaulthelper. This seems to be useful not just here.❓Could we check here whether we have a non default workspace and otherwise return to the core implementations?
I'm curious whether we could send along some helpful message here. On top of that I'm also curious: Is a 400 exception, aka. a client error, semantically really the right thing?
❓There seems to be a reason why this method is not on the interface. For me at least though is not obvious, given that its actually called directly.
Maybe comment why this cache clearing is needed.
Theoretically a anonymous user might have a session. I guess its a fair limitatation to not allow anonymous users to access different workspaces. Otherwise page cache might be way more tricky.
Is tempstore really the right place? I would have put it directly onto the session, given that this plugin talks about sessions in the first place. Some nitpick about the method names. What about adding 'active' to them?
Why is this not using the
isRemotemethod/annotationLet's use sha256, otherwise some people will complain :) Note: I don't really see a reason for the * 10000, it feels like you basically want to generate a random unique string. Don't we have better helper methods for that?
❓This seems to not protect against source and target being both remote workspaces, but both the same
As written in the other place, we need additional cacheable dependencies ...
Shouldn't the workspace negotiation inform the system how they determined this value? For example the session one depend on the current active session.
Should we key this value by the entity_type->id() as well, as otherwise this array just grows over time?
Good question. Maybe access checking should be done explicitly using its own method?
Is there a follow up to allow deploying using a REST api?
🔧 You could also use
$this->assertSession()->statusCodeEquals(403)👏
🙃 These two patterns look totally different but do the same.
I didn't had yet the mental capacity to review this.
👍 I really like this permission name
Given that its plugin IDs depends on content entities (which is a bit weird to be honest), I think we should try to use a different cache bin, given that otherwise we might have a lot of entries in the APCU cache. There are various issues out there related with that.
❌ This should be 'workspace' instead of 'cron'
Comment #96
amateescu commentedWhoa, that's quite an awesome review! I fixed most of the points in this commit from the 8.x-2.x branch of the contrib module, but ran out of steam for now so I'll post a more detailed reply and a new patch tomorrow :)
Comment #97
jhedstromThis is looking really great!
One question/comment:
Can the heavy lifting in these hooks be moved to a class (or classes, as makes sense), similar to the way
workspace_entity_accessis already doing? Thecontent_moderationmodule (and I thinkworkflows) does this extensively...That way, all these private functions could be moved into the respective classes as protected methods, which will make for easier testing...
Comment #98
sam152 commentedThis is looking awesome, huge thumbs up to the team who worked on this!
It has come up before that adding @internal isn't currently enough to fully allow this to change/be removed, if that's the intention. Maybe worth a link off to https://www.drupal.org/node/2917276?
This should have "max_length" set to EntityTypeInterface::ID_MAX_LENGTH. This becomes a huge pain to write an update hook for and is important if you start adding indexes to these fields.
We can check
toUrl('collection')->access()instead of using the permission.Is there a reason Workspace::load isn't good enough here? Would this help with the hinting too?
Can we use $entity_type->entityClassImplements() here?
Recently reviewed #2835542: Utilize service decorator instead of taking over entire EFQ service. I wonder this should attempt to decorate whatever is already in the container instead of extending the existing class. Two modules trying to replace this service does rare enough to not be concerned about this for now.
Are there any use cases which might depend on anonymously owned workspaces? Are we doing this so user "0" never owns a workspace? Would would be the impact of that?
It would be nice to be consistent with hook bridges, but 100% not a blocker.
There was an issue somewhere related to formalising the use of properties on entities stored for a single request, but I can't seem to track it down.
The views stuff easily feels like the most complex part of the module. I wonder if there is a way to pull this out into some hook bridges or a views.inc file to make it a bit more manageable. It seems like testing this thoroughly could also be such a huge undertaking, that it's almost worth delegating to a follow up?
Comment #99
amateescu commented@jhedstrom and @Sam152, thank you for taking a look as well!
Let's get the easy stuff out of the way first :)
Re #95:
1. Sure, added the usual
@seementions to that annotation class.2. Yup, we wanted to rename this entity type for quite a while, but let's figure out what's the best name for it because there's a lot of code which needs to be updated.
Your suggestion is
content_workspace_assosiation, and we were thinking aboutcontent_workspace_linkorworkspace_content_linkbecause that signals a bit better the fact that this entity type is what links a piece of content to a workspace.Any other suggestions are welcome!
3. Nice and clear ;)
4. I don't think those fields from the
content_workspaceentity type need either individual nor composite validation. There is no form exposed for that entity type and it is excluded from REST support which means there no code path which would callvalidate()on it.5. Of course, added more documentation to
ReplicationLogInterface.6. That method was already renamed in #91 to
addHistory(), but I like your suggestion better so renamed it again toappendHistory().7. I think it makes sense to have that static method on the interface because if someone swaps the entity class and doesn't extend the existing one, they still need to implement the interface, including that method.
8. How did you spot that?! That typo was sitting there since December 2015 :D Fixed!
9. Just discussed this point with @timmillwood earlier today and he told me that initial reason for that unusual regex validation was to conform to the CouchDB specification for allowed characters in a database name. However, it makes more sense to use the more standard Drupal machine name pattern since the local replication that is being added to core is not really tied to the CouchDB protocol in any way.
10. Yup, we're actively working on updating that part of the code in #2931067: Add a ContentRepository config entity so we can store fully configured repository handlers, which can be reused for the upstream values of a workspace, and the new config entity type does use the plugin collection pattern.
11. Right, fixed :)
12. You are right, and just above the code you referenced there's a comment saying that the current permission model is designed for a handful of workspaces and that it doesn't scale to dozens of workspaces. I'm not yet sure what we can do about this, maybe we should have a workspace_permission submodule (or in contrib) which provides a different model than the existing "3 permissions per workspace".
However, adding content entity dependencies to user.roles.* configuration is a very good point. We are actively discussing if we should change the Workspace entity type back to a config entity, like it was initially. Stay tuned for an update on this :)
13. 15. and #98.6: The entity query stuff will be handled together in a followup comment/patch.
14. Heh, I had the same thought and I added a
isDefaultWorkspace()helper method in #94.16. You're right, a 4xx status code is not correct for that specific problem, changed it to a 500 status code and added a message describing the problem.
17. There is no reason for that method to not be on the interface, so I moved it there :)
18. That cache clearing was totally not needed, removed!
19. Right, I removed the part of the comment which was mentioning sessions.
20. Yes, I think tempstore is a good place for persistent storage of the active workspace. Imagine a content editor logging out at the end of the working day, and then logging back in the next morning. Storing it in the tempstore instead of the actual PHP session allows them to get back instantly to the same editorial state from the previous work session.
21. We don't use
isRemote()there because there might be multiple "local" implementations of a repository handler, for example "local_couchdb" or something like that, and that code only wants to remove the workspaces from the local Drupal installation.22. Yeah, I also thought that was weird, I just didn't get to fix it until now. The CouchDB specification says that the session_id of a replication log entry is an UUID, so I simply changed the field type and implementation to use an actual UUID :)
23. That's because the source is *always* a local workspace.
24. Fixed as well by adding
cachePerUser().25. I don't think it needs to do that. Theme negotiator have a very similar pattern for the negotiation and yet
\Drupal\Core\Cache\Context\ThemeCacheContext::getCacheableMetadata()return an emptyCacheableMetadata()just like we do.26. That array can't grow over time because the first check that we do in that method is a
in_array():)27. Not sure what you mean there, where would that method live and what would it do exactly?
28. Yes, we have #2926472: Add test for basic remote replication open for that.
29. Sure thing, fixed!
30. ;)
31. Right, changed the code in
workspace_entity_load()to match the others.32. It's actually a pretty simple concept once you read the code and the comments a few times. But I'll let you get to it in your on pace :)
33. Me too :P
34. This will also change with #2931067: Add a ContentRepository config entity so we can store fully configured repository handlers, which can be reused for the upstream values of a workspace, and there will be way less repository handler plugin definitions.
35. Oops, copy/pasta fail :) Fixed!
Re #97 and #98.8 and .10: Sure thing, we'll change that code to follow the hook bridge approach in a followup comment/patch.
Re #98:
1. This entity type is way less risk-prone to "intrusion" than the one from Content Moderation because there is no user-facing code which can interact with it, everything happens behind the scenes. While in CM, a user can actually "act" on the
content_moderation_stateentity type through themoderation_statebase field from each moderated entity type.Since we're still in pre-alpha (actually, pre-initial commit stage), we'll need to discuss whether we want to keep the "link" entity type approach or move it to custom database tables before we reach beta status.
2. Good spot! Fixed.
3. Sure, good idea. Fixed.
4. Because I wanted to use the injected services more. In that form we don't really need the entity type manager for anything else so I changed the local variable to simply store the workspace storage object.
5. Of course :) Fixed.
6. See above, will be handled in a followup comment.
7. Yup, we're doing all that just so a workspace is never owned by the anonymous user. I'm not really sure of the impact of that scenario, but I simply wanted to don't have to think about it :P
8. and 10. See above.
9. That would be super nice to have, let's hope we find that issue at some point :)
In conclusion, the following points were left for a followup comment/patch:
#95: 2, 10, 12, 13, 15, 27, 28, 34.
#97 and #98: 6, 8, 10.
Comment #101
dawehnerI'm actually fine with workspaces being content entities, I just don't believe in adding permissions by default being useful. I think there is an obvious space for contrib modules/custom workflows providing more features. Note: it is not impossible to actually have content entities dependencies (see block_content module as example), though, as far as I understood it, workspaces are something you would throw away after you've done a specific bit of content work.
These kind of ideas (how should editors actually use it) would be ideally documented already, maybe in the issue summary, or on the handbook page.
👍
Good point. Maybe document this particular usecase in the code? Not sure whether this is too specific though.
You know, this could be simply a bug?
Comment #102
amateescu commentedOk, I've added a @todo in my list to open an issue to discuss whether we want to keep the per-workspace permissions or not :)
Good idea, I'll do that in a follow-up patch.
I guess you're right, another @todo point added :)
In the meantime, this patch should fix the failures from #99 and make that access checking code a bit more readable.
Comment #104
catchThis isn't necessarily true if you wanted to support the following:
- 'revert' to the state of a previous workspace (which I don't think core should support but contrib CPS does)
- 'preview/review' of the site in a previous workspace state (ditto)
- diffs between old workspace states relative to each other (more or less ditto too, not sure if CPS supports that though)
Regardless of that though, you'd have to actively delete a workspace unless we auto-delete after publishing to avoid them accumulating, and even if we do it makes sense to me to remove the per-workspace permission from the initial patch.
I have this high up on my list to review - I have been discussing workspaces in a fair bit of depth on and off with amateescu though prior to the patch being here.
Comment #105
amateescu commentedOk, removed the per-workspace permissions for now and documented why we use the private tempstore in the session negotiator. Also, rebased the dependencies to include #2885469-93: Regression: manually setting the ID field for newly-created content entities is not possible anymore (public follow-up to SA-2017-002), which should fix the test failure from #102.
Comment #107
amateescu commentedHere's a new update with the following updates:
which fixes these review points: #95.2, .13, #97 and #98: 8, 10.
#95.12 has been fixed in #105.
What's left to do:
#95.10 and .34: #2931067: Add a ContentRepository config entity so we can store fully configured repository handlers, which can be reused for the upstream values of a workspace
#95.15 and #98.6: #2934352: Consider using a service decorator for the entity query alter
#95.25: #2934354: Expose cacheability metadata in WorkspaceCacheContext
#95.28: #2926472: Add test for basic remote replication
At this point, all the reviews so far were either addressed or have dedicated followups, so we're ready for more eyes on the "for-review" patch! :)
Comment #108
sam152 commentedI found the issue mentioned in #98.9.
Comment #109
amateescu commented@Sam152, thanks for that link, following for now :)
#2929835: [regression] Modules can no longer alter the the table queue of a \Drupal\views\Plugin\views\query\Sql query object was committed so we no longer need to include it in the combined patch, and #2885469: Regression: manually setting the ID field for newly-created content entities is not possible anymore (public follow-up to SA-2017-002) changed drastically so we have a new blocker: #2935076: EntityResourceTestBase::getNormalizedPostEntity() wrongly assumes that all entity types have auto-generated sequential IDs. However, the change needed for the "for-review" patch is minimal, as can be seen in this interdiff.
Comment #110
catchJust lost a long review as I was posting it, here's a second attempt, hopefully remembered most of the points.
Do we actually create the 'live' workspace as a config entity? If not why is this necessary? If so why not hard-code it somewhere? At the moment this means a manual install and uninstall isn't possibly no?
Why is this necessary? It's taking the abstraction of repositories but still embedding the concept of local workspace deployment into the API and makes me wonder about the abstraction a bit.
c&p error, it's not a test entity.
The comment doesn't make it clear what this is. From reading down the patch, it's for partial deployments. For local deployments we should be doing everything in a single transaction anyway, in which case a partial deployment is impossible.
Again there is support for both local and general workspace handlers in the API, why is this necessary? And if it is, why are local workspaces handled by the same system?
'this workspace is for' seems odd, wouldn't it be 'in this workspace' or 'associated with this workspace'.
Can we remove this?
Hello old friend..
Really surprisingly tidy.
Now 'target'.
Same here with upstream/target.
This seems a bit overkill, there's not much we store between sessions.
See questions above - why session ID instead of unique ID, and why is sequence ID necessary?
Why the logging here and not using the logger?
Discussed on slack, but this is probably worth splitting out to its own dedicated plugin.
This should happen in a transaction so we can roll back if any of the entity saves fail. Additionally there currently appears to be no protection for saving items in a workspace over more recently updated entities on live - that could happen either via a pre-deploy audit or when saving but seems like we need one or the other?
Why do we need to know here?
Comment #111
amateescu commentedThanks for reviewing, @catch!
Let's get rid of the easy stuff first. Re #110:
1. Yes, we create a content entity for the
Liveworkspace, seeworkspace_install(). The reason for that is so that users can change and/or translate its label. I did think about hardcoding the default workspace in its storage handler, but I just didn't get to actually do it so far. Would you prefer if we do that instead of creating it inworkspace_install()?And the module can be uninstalled, there's even a test for that:
WorkspaceUninstallTest:)2. and 17.: The 'remote' definition key was meant as a helper for the repository handler base class to provide a default configuration form element for the endpoint URL for remote targets, but @timmillwood also didn't like it very much and now I realize that this could be handled better with an abstract
RemoteRepositoryHandlerbase class. Removed the 'remote' definition and its getter.3. Fixed.
4. Updated the description text to mention that the field is only useful for remote deployments.
5. Adding the
getLocalRepositoryHandlerPlugin()method is a change that I rushed in at some point. I agree that it's wrong and will be removed in #2931067: Add a ContentRepository config entity so we can store fully configured repository handlers, which can be reused for the upstream values of a workspace.6. Fixed.
7. Yes, we can remove that comment since we no longer provide so many per-workspace permissions. Fixed.
8. Oh hai! Long time no see..
9. Took a while to get to that tidy code, but yes, EQ is much easier to deal with than it was in D7.
10. and 11. Are you proposing to rename the
upstreamfield from the workspace entity totarget? I guess I'm ok with that, just need a confirmation :)12. Do you mean that it's overkill to use the private tempstore instead of the actual PHP session? I thought the comment clarifies why we chose that approach. So that editors can come back to work the following day, login and see the site in exactly the same "state" they left it in the previous day.
13. It is called
session_idto match the CouchDB spec (http://docs.couchdb.org/en/2.1.1/replication/protocol.html#retrieve-repl...) but it is actually an UUID, see the bottom of\Drupal\workspace\Plugin\RepositoryHandler\LocalWorkspaceRepositoryHandler::replicate(). And sequence ID is not necessary for local deployments, it's only there to have a unified "logging" facility for all kind of deployments.14. That's because the
ReplicationLogentity type provides us with a well-defined schema for logging deployments, that we can for example inspect with an entity query, while core's standard logger does not have this capability.15., 16. Both points will be handled in #2931067: Add a ContentRepository config entity so we can store fully configured repository handlers, which can be reused for the upstream values of a workspace. That's the issue where we want to clarify and finish the API around repository handlers.
After fixing/clearing up all the other review points above, would you agree that #2931067: Add a ContentRepository config entity so we can store fully configured repository handlers, which can be reused for the upstream values of a workspace is the only (architectural) beta blocker for the module?
This interdiff also contains a change requested in #2910673-6: Workspace UI usability review, and restricts the upstream/target workspace to
Livefor now in the UI.Comment #112
catch#111.1 - I think it makes more sense to hard-code it, the label could still be translatable via string translation. Live isn't really a workspace at all, so better for me for it be as unlike a workspace as possible, it's really another name for 'none' in select lists and etc.
#111.10-11 - I think target is right, or at least it's much closer to what's happening than remote if we're sticking to git-centric terminology. Not sure if the UX team has seen this though - to the extent that the terminology is exposed in the UI it'd be good for them to see it.
#111.12 - to me it seems like overkill, I might not want to go back to exactly the same workspace if I've logged out, in which case I'd need to go back to live. Or maybe someone else has finished working on the worspace and published it in the interim, and I'd get redirected to a 403 after login.
#111.13 - uuid seems better. Couch DB presumably doesn't have to worry about confusion with PHP session IDs.
#111.14 - why is that any different to other systems using the log component though? We don't log anything for edits on the live site, but we do have the full revision history, which we'll still have here. It feels like a feature that's awkwardly between revision logs and the logger.
#111.15-16 - following that issue now, when we discussed that on slack I wasn't sure about the relationship between the plugin and the config entity (seemed inverted compared to our usual pattern for menu links etc.) but haven't reviewed code yet.
I didn't spot other major architectural issues but really need another look through.
Comment #113
dixon_Regarding #112.13 and #112.14:
The case for ReplicationLog to exists as a purpose-built system is because it's an important piece for a generic replication system to work. For example, in the CouchDB replication protocol these log entities are exposed over an HTTP API (as implemented by the Relaxed module in contrib) and used by clients to optimize replications, and figure out from where to start/continue/stop a replication. These ReplicationLog entities contain useful information about the sequence numbers etc.
One can easily argue that ReplicationLog entities are primarily useful for remote replication (as local replications are relatively straight forward). However, in order to design generic interfaces in a way that makes sense I would argue that we should keep ReplicationLog for this purpose. For example
RepositoryHandlerInterface::replicate()returnsReplicationLogInterface. That makes sense, and provides the foundation for contrib to build remote replications on top. Remote replication is the primary use case that is funding this initiative.Regarding the name
session_id, that's a CouchDB-ism and I'm fine with renaming this to something more generic that makes sense for Drupal. The serialization for CouchDB compatibility over HTTP can map this in contrib.@catch All this said, I don't want this to hold up progress on getting workspaces into core. So if my arguments are not convincing enough, let's figure out what the interfaces would look like without ReplicationLog. Then we can find ways around this in contrib later :)
Comment #114
James Marks commentedRegarding #45:
We solved this problem on our installation by injecting a workspace class in the body tag and have provided this solution as a feature request.
https://www.drupal.org/project/multiversion/issues/2933302
Comment #115
amateescu commentedA small update with fixes for these review points from #110: 10, 11, 12, 13
- Rename 'upstream' to 'target' and remove the custom widget and validation constraint
- Use the PHP session instead of the private tempstore in the session workspace negotiator
- Rename the 'session_id' field from ReplicationLog to 'session_uuid'
- Clean up all the functional tests
For #110.1: I've been trying to remove the Live workspace in #2935780: Remove the concept of a 'live' default workspace but I'm not sure that it's possible.
#2885469: Regression: manually setting the ID field for newly-created content entities is not possible anymore (public follow-up to SA-2017-002) was committed in the meantime, which means we only have one more (small) dependency for this patch: #2935076: EntityResourceTestBase::getNormalizedPostEntity() wrongly assumes that all entity types have auto-generated sequential IDs
Comment #116
catch@dixon_ #113
To me it feels like a feature that's specific to replicating workspaces remotely, which is not really generic (it might be generic for remote replication).
Is there a particular reason that the log couldn't be set as a property and returned via a getter? Then it'd be straightforward to extend the interface in contrib to add that getter.
Also a general question on remote replication. Are workspaces replicated as-is (i.e. as pending revisions attached to a workspace) then 'published' using local replication? That's the only way I could think to rule out network issues or similar interrupting mid-replication and leaving the live site in a partial state.
@amateescu #111.
I'm not sure how #2931067: Add a ContentRepository config entity so we can store fully configured repository handlers, which can be reused for the upstream values of a workspace affects #110.16 (except maybe touching the same code?), it's an implementation rather than API detail that should only affect the method and maybe error handling.
Comment #118
amateescu commented@catch, yes, it was about touching the same code :)
This should fix the REST test fails, a new field was added by #2891215: Add a way to track whether a revision was default when originally created.
Comment #120
amateescu commentedThose new failures are being handled in #2936704: Remove REST route deprecations from DeprecationListenerTrait::getSkippedDeprecations(), use ExpectDeprecationTrait::expectDeprecation() instead, let's wait a bit for that issue before adding deprecation messages for the workspace routes.
Comment #121
dixon_@catch
I agree it would be nice to hard-core the live workspace. But unfortunately I think this is a bit too hard to achieve. We've already achieved zero performance impact with the live workspace by not joining/querying with it. But the live workspace entity is still needed to not create unmaintainable exceptions in the code everywhere. Tim and I attempted this in the past and failed, and Andrei attempted it recently here: #2935780: Remove the concept of a 'live' default workspace.
So, I would suggest we keep the live workspace entity for this initial patch. This pattern would be similar to how the anonymous user entity is kept in the DB. If we later find clean ways to remove the live workspace entity I'm all for it!
I understand your viewpoint here, and I don't disagree. One can definitely implement the necessary logging in different ways.
I'm fine with removing the ReplicationLog entity from this patch.
Right now (in the first iteration of the contrib modules) they are replicated as-is to the target workspace (wether that's the "live" workspace or another workspace on the remote site). But that's mostly due to specific implementation details in those first versions of the contrib modules. We haven't yet written remote replication on-top of this core patch. The flow you describe here certainly makes a lot of sense for robustness!
Comment #123
amateescu commentedTime for a "small" update here with all the work from the past couple of weeks. Here's the shortlog of the changes:
- Allow workspaces to be deleted
- Adding a new published entity in a non-default workspace does not create two workspace association entries.
- Do not allow concurrent edits in different workspaces
- Delete workspace_association entries when a workspace is pushed.
- Clean up the workspace switcher block and a few tests.
- Small test code cleanup
- Design a proper API for repository handlers.
- Work around the fact that core's REST test suite assumes that all content entity types have serial IDs.
- WorkspaceActivateForm cancel redirects to manage workspaces
- Remove the ReplicationLog entity type.
I emphasized a few changes that I think are the most important. The only @todo from @catch's review is #110.16 :)
We implement a workaround for #2935076: EntityResourceTestBase::getNormalizedPostEntity() wrongly assumes that all entity types have auto-generated sequential IDs so we don't have to roll combined patches anymore, but #2936704: Remove REST route deprecations from DeprecationListenerTrait::getSkippedDeprecations(), use ExpectDeprecationTrait::expectDeprecation() instead is still a problem so this patch will probably have the same fails as #118.
Comment #125
amateescu commentedAnother quick update which should fix all the outstanding review points and a couple of PostgreSQL and SQLite problems. Here's the changelog:
- Fix SQLite support.
- Cache the active workspace in WorkspaceManager
- Decorate core's entity query services instead of replacing them and fix PgSQL support (fixes #98.6)
- Wrap workspace deployments in a single database transaction (fixes #110.16)
- Remove workaround for #2935076: EntityResourceTestBase::getNormalizedPostEntity() wrongly assumes that all entity types have auto-generated sequential IDs
In the meantime, #2935076: EntityResourceTestBase::getNormalizedPostEntity() wrongly assumes that all entity types have auto-generated sequential IDs and #2936704: Remove REST route deprecations from DeprecationListenerTrait::getSkippedDeprecations(), use ExpectDeprecationTrait::expectDeprecation() instead were committed so we're down to 0 dependencies!
Comment #126
amateescu commentedIMO, this is quite ready for RTBC, there's just one pending question: should we rename the module to Workspaces (plural), to match Views, Workflows, etc.?
Comment #128
amateescu commentedMy local branch had a commit which was reverting #2935076: EntityResourceTestBase::getNormalizedPostEntity() wrongly assumes that all entity types have auto-generated sequential IDs, fixed that and also two additional things:
- Fix error translation string
- Remove dependency on the User module
Comment #130
amateescu commentedIt seems that the dependency on the User module is actually required for some reason.
Comment #131
larowlanReview in progress, note to self - up to WorkspaceNegotiator
i don't think this is needed, as its the default
these look to be using the defaults - not needed?
Are we sure we want content entities instantiating plugins. Could this be done from outside the entity?
Is this repeating the design of content moderation? I thought that now we can delete/purge base fields we would not need these intermediate entities anymore?
Is this needed because one content entity revision can live in multiple workspaces?
we should use === here.
I think this is the first instance we have of permissions per content entity. How big is this going to get? Our permissions system is pretty stretched as is.
nit, = not needed
nit, should use === here?
Seems like one workspace per content entity, so wondering about the intermediate entity?
nit, these are strings, lets use ===
this is a fairly broad-brush catch here - what exceptions are we expecting?
can we be more granular here?
This is fairly extreme, can we not handle this with a 404 in the routing system?
Like I said above, I'm not sure its the entity's job to instantiate plugins.
I think a separate service would be fine, or even the plugin manager itself.
$repoHandlerPlugin->instantiateForWorkspace($workspace)
And then change the getRepositoryHandlerPlugin method to just return the ID instead.
same here, very broad exception being caught
This isn't very useful to help debug what went wrong - can we add some more detail about what was being deployed etc?
can we comment why this is needed?
this occurs a few times, merit in having a convenience method on the storage handler or manager?
When would this happen? If the user submitted a form with a missing workspace, the select field validation should prevent this, and show the 'invalid value' error?
same comment about broadness
Comment #132
amateescu commentedThanks for the review, @larowlan!
Re #131:
Workspaceentity type to be the main entry point for pushing/pulling changes. In fact, I made that even more clear by adding thepush()/pull()methods toWorkspaceInterface.WorkspaceAccessExceptionthere, fixed.createFromWorkspace()to the repository handler manager.RepositoryHandlerInterface::push()/pull(). We could introduce a separateRepositorySyncExceptionexception, which would be thrown byRepositoryHandlerInterface::push()/pull()and only catch that one. Would you prefer it this way?\Drupal\Core\Entity\ContentEntityForm::getEditedFieldNames()tells exactly what that method does and when/why you need to override it.WorkspaceAccessException. Fixed.Comment #133
phenaproximaOne of the things I admire most about the Workflow Initiative is that y'all take the most arcane, complicated stuff in Drupal core -- the kind of stuff that makes me want to give up programming and become a dirt farmer -- and write elegant, clear, understandable code to fix it up. It's glorious. I didn't get very far into this review (it'd take a day to read the entire thing), but this is fantastic work. I don't think I found anything serious, and I definitely don't think I'm qualified to RTBC, but I did find a few minor things and have some random questions (forgive me if they were already addressed earlier in this thread).
This regex will match an empty string. Shouldn't that * be a + to prevent that?
Should this be an entity reference?
This seems like it should cache the repository handler as $this->repositoryHandler or something.
Nit: $this->set() already returns $this.
Same here.
And here.
Why aren't the data and revision data tables defined?
I'm probably being overly paranoid, but won't this field type depend on the nature of the referenced content entity? For example, what if it defines its ID field as a string (like the Workspace entity type itself does)? How can we account for that? Should we even account for that, even if it's just by throwing an exception?
On the other hand, Content Moderation uses this same pattern, so it's probably not really an issue. But, I figured I'd mention it anyway. It's just that there's no official guidance (as far as I know) which says that content entity IDs need to be integers, so this assumption could produce some major problems in exotic use cases.
Nit: 'ie' --> 'i.e.'
This seems inaccurate.
This seems like a good candidate for array_filter().
The entity_load hook checked if $this->workspaceManager->getActiveWorkspace() returned a truthy value before calling isDefaultWorkspace(). Why doesn't this method do the same thing? (Although, I prefer this cleaner, more streamlined code.)
This same code has been repeated three times now; maybe it should be a helper method? :)
This seems like it could cause infinite recursion. Why doesn't it? Can there be a comment explaining how that is prevented?
Should probably be moved to a helper method.
I don't understand how comparing revision IDs will determine if the revision is "new". Doesn't RevisionableInterface have an isNewRevision() method for this purpose? Why won't it suffice? Can the reasoning be documented?
Nit: You could reuse $storage->create() here to keep things a little more cleanly encapsulated.
It'd be cool if there was a method on WorkspaceAssociation to set the content entity type, ID, and revision ID in a single call. Something like
$workspace_association->track($entity).Not quite accurate! :)
Comment #134
amateescu commentedThanks for the review (and the kind words), @phenaproxima!
WorkspaceManager, it can use the standard check as well.\Drupal\workspace\WorkspaceManager::shouldAlterOperations()for this repeating code.isNewRevision()there but there was a problem. I don't really remember the details, but the test fails can be seen at #2938944-9: Adding a new published entity in a non-default workspace does not create two workspace association entries.WorkspaceAssociationentity type so far, so I'd wait for a couple more use cases before introducing that helper.Comment #135
borisson_I didn't get trough the entire patch, but I have some nits.
Node and Media also have this, this is the 3th occurance of this exact code + docblock in core, should we open a followup to figure out if it makes sens to move this to a common class, or a trait?
For nodes, this is
t('The time that the node was created.'), so I think we can also do this here:The time that the workspaces was created.Instead of adding the inline comment here, can we just add this in the
@paraminstead?This comment doesn't add a lot of value, I think it can be removed.
This comment is repeated in every method of this class, that smells a little weird. I think it is valuable though, no idea how/if that can be improved.
Mismatch in docs vs class name.
By doing early returns in this method we can make it easier to grok.
docblock vs classname.
The DeletedWorkspaceConstraint has a docblock here, we should either remove that one, or add a similar one here.
Comment #136
amateescu commentedThanks @borisson_ for taking another look! :)
Re #135:
@paramtype hint will not match the actual function signature, and we can't change that one because that's what we receive fromhook_entity_presave()..Comment #137
catchI'm still struggling with this a fair bit. Going to try to approach it from the point of view of use cases rather than architecture to get the issues across.
User A starts workspace A, modifying entity 2.
User B starts workspace B modifying entity 3, this change is pretty minor.
User A decides they want to add an entity reference from entity 2 to entity 3, and they want to incorporate the changes from workspace B so they can see how it'll look together.
User A would then need to either 'pull' from workspace B, or go to workspace B and 'push' to workspace A.
However with the current architecture, both of them have 'live' stored as a target.
So if I want to merge, do I need to edit the workspace, change the target, do a push or pull, then edit the workspace again, and change the target back to live?
It feels to me like merging, publishing to live, and replication content-hub style are really three different operations that could each happen to any workspace - so you could have a 'merge' form and a 'publish' form as different things, rather than dependent on configuration.
With the example of replication I asked about whether it would first replicate as a draft workspace then deploy two live as a two step operation that could take advantage of transactions for the publishing step - this again means the target of the workspace would need to change in order to do that (or switch the target of the workspace when it gets replicated itself?).
I haven't re-reviewed the entire patch, but this is the one thing that I had trouble with the last time I did a significant review and I'm not sure I got it across last time properly.
Comment #138
timmillwoodIn the current patch it's only possible to push to and pull form live, but in follow up issues we will be supporting pushing to and pulling from other workspaces. Although as you suggest this is restricted to the workspace set in the "Target workspace" field. Here we're really looking to simplify the user experience. Where you press the "Deploy" button, or the "Update" button and it just does it. No need to think about what you're doing and where it's going.
We're also trying to follow the example of git where you can set an upstream. However with git, unlike our implementation, you can pass in an argument to push somewhere else. Maybe we need an advanced option to select where to push to, or pull from.
I believe this work is best suited to a follow up after adding the functionality to push / pull between workspaces.
Comment #139
catchYes I don't think #137 is blocking commit - we should open an explicit follow-up to discuss it and figure out a bit more before workspaces goes into beta/8.6 alpha though.
Comment #140
timmillwood@catch - I have opened #2958752: Refactor workspace content replication plugin system to three services to discuss #137 further, and also look at other possible content replication use cases. We can all keep the issue updated with thought, ideas, and use cases we want to consider.
As mentioned there, the patch from this issue only covers replication to and from the
liveworkspace, therefore we already have a lot of work to do on replication, so it'd be good to make sure we cover all use cases.It'd be great to get the patch here committed ASAP so we can start working more actively on the follow up issues, which as you know we have around 3 months to complete if we wish to get Workspace module Beta ready for 8.6.x.
Comment #141
timmillwoodAfter discussing with @gaborhojtsy and @catch on slack this is ready for RTBC (and maybe even commit).
Comment #142
catchNot a full re-review but a couple of things I picked up on. Only the accessCheck(FALSE) is blocking. If I'm able to find time to keep reviewing later I'll keep going, but was sort of hoping to commit this so posting a partial review now in the hope there's nothing else to mark CNW over.
This should have accessCheck(FALSE). I can think of very strange situations like domain module and a revision being attached to a different domain?
!== wouldn't hurt.
Is this logic explained anywhere else? If not it could probably use expansion (can be in a follow-up issue, but it's a tricky thing to explain exactly why it happens like it does).
Comment #143
timmillwood#142.1 - Fixed.
#142.2 - We've seen issues with revision IDs being returns as string or integers, so I think
!=might me more reliable than!==.#142.3 - Added a follow up #2962764: Better explain _initialPublished process.
Comment #144
tstoecklerDisclaimer: I only read the issue summary and the patch, not any of the comments.
Started looking at this and stopped at the first somewhat substantial remark, the previous ones are all nitpick-y.
In general, though, I am not really excited about the approach. The fact that we are hooking into entity_load directly and altering what people will get from there is making me really uneasy. We are already doing something like that for config overrides and that is causing all sorts of problems because then sometimes you do just want the unoverridden config. The same applies here: If I do just want the entity in the default workspace there seems no easy way out. The approach we have taken for content entities and translations is also far from ideal as it requires to call
getTranslationFromContext()in a bunch of different places where you would expect to be just be able to deal with the entity directly. However, it is a fair bit less magic. And the fact that throughgetTranslation()and friends dealing with different translations of an entity is fairly manageable. So I am wondering whether instead of hardcoding this magic into entity_load() whether we shouldn't instead somehow generalize thegetTranslationFromContext()concept. Perhaps it's even possible to re-se it or parts of it as it seems in those places where you want the "currently active translation" of an entity are also those where you want the "currently active workspace variant". And while we can't add methods such as->getWorkspaceVariant()or whatever to every entity, maybe we can introduce some sort of wrapper object, which gets instantiated with an entity and makes switching back and forth easier to deal with. (Adding such an object can certainly be a follow-up, but this is less about the concrete proposal of such an object, but more an architectural criticism.)One other note: The issue summary mentions being able to switch workspaces via query parameter, but I don't see that anywhere in the patch. Did I miss that, or where does that happen? Because it does seem very much related to the above comment.
Here's the (first part of the) Dreditor review:
Usually we put another "/manage" part in the path. The way it is currently, you cannot name a workspace "add".
Minor, but weird to explicitly use the key-value duplication in this data structure. I would have just used an array of IDs directly.
Missing
label_collectionNone of these have any validation constraints. Would be a nice follow-up.
Very unfortunate that this has to bet from the outside. For Config Import we basically set a global in a service during the sync operation so that people can check that from everywhere. I'm not entirely sure if that is better than this flag which at least is in context of the thing that is being updated.
However, I'm not excited about introducing such a workaround here if there is no clear path to a better solution. I.e. a similar behavior was already discussed for migrations in #2329253: Allow ChangedItem to skip updating the entity's "changed" timestamp when synchronizing. Speaking of migrations: What if I am running a migration and not in the default workspace? Will this create new revisions for me then?
Is this missing a check that the entity is not in the default workspace? Looking at the code again, will this not always set a new revision for every entity save regardless of the workspace?
Comment #145
timmillwood@tstoeckler - Thanks for the review.
We decided we didn't want / need to be able to switch workspace via query parameter, so updated the issue summery.
Comment #146
plachIn general this looks great to me, I love how the logic is broken down: the architecture is very clean and easy to follow.
OTOH @tstoeckler makes a good point about hooking into entity loading, however the alternative is to require every single module out there wishing to integrate with Workspace to explicitly support it by leveraging a specific API. This doesn't sound great either, but #144 looks very similar to the API proposed in #2942907: Entity system does not provide an API for retrieving an entity variant that is safe for editing, which we discussed as part of #2940575: Document the scope and purpose of pending revisions. Once we have this new API in place, modules will be required to adopt it anyway, to ensure they are dealing with revisions correctly. Maybe a way forward could be to commit the patch with the entity load approach still in place and add a @todo to revisit that logic once the new API is available.
About #144.5, I'd suggest to open a follow to finally provide an alternative to these horrible dynamic properties and a @todo there. I'm wondering whether we could implement a non-persistent repository of entity objects (maybe leveraging
SplObjectStorage) to store these dynamic flags and attach a different instance of this repo to every request in the request stack. We could also have a persistent variant to leverage in batches, maybe using state or another KV collection.Aside from those issues, here is a code review: I didn't find any blocker, i.e. changes not allowed by our experimental module policy, but it would be great if we could address these before commit.
Why is naming inconsistent here?
Why is naming inconsistent here? Should the former be renamed to
isEntityTypeSupported()or something similar?The method name and PHP doc are a bit inconsistent: what about shouldAlterBehavior() or something like that?
Given that we are always working with content entities, would it make sense to add a check on
ContentEntityInterface::classas well?Shouldn't these URIs be somehow namespaced? E.g.
https://drupal.org/link-relations/workspace/activate-formThe PHP doc and method name seem a bit inconsistent: either we are providing implementation details in the PHP doc or the method name is too generic, in which case I'd go with
markAsPushed()or something along those lines.Maybe it's just me, but those PHP docs look a bit weird: from the constant name I'd expect it to mean the opposite, e.g. in the case of
CONFLICT_DELETE_ON_CHANGEI would expect that an item has been deleted in the target while being changed in the source (where "target" is usually the default workspace, I guess).PHP docs and method names seem a bit inconsistent: if we are only getting a list of differring revisions (and not the actual differences), can we change names accordingly? e.g. getDifferringTargetRevisionIds?
How does the caller know whether UUIDs or revision IDs were actually returned? Wouldn't separate methods with different return values provide a better DX?
Would it make sense to have separate methods to get default entities and all revisions? e.g. getTrackedEntities() vs getTrackedEntityRevisions?
Is the $group parameter actually needed? It seems unused outside tests.
Why we are not using en entity reference here?
At first sight it seems we are unnecessarily coupling the "repository_handler" and workspace concepts. Does this method really belongs here?
Why not defining ::DEFAULT_WORKSPACE on WorkspaceInterface so avoid adding a reverse dependency from workspaces to their managers?
This should be
ContentEntityStorageInterface, I guess.What about
::appliesTo()? ;)The comment is a bit weird: isn't this checking whether the user can actually access all workspaces?
Don't we need a cache clear after deleting those records?
This kind of string concatenation doesn't work with RTL languages: we need something like
t('@label (@id)', $args).Shouldn't the class named
is-default?Normally we put build in names returning a renderable array, what about
workspace_build_renderable_links?Indentation seems off here
This is runtime code, so shouldn't it be moved to
EntityOperations?Shouldn't this be
ContentEntityFormInterface? Otherwise we are unnecessarily processing config entity forms (for a few more lines :)Unless I'm missing something, the
#markupmessage is never displayed, right?Doesn't this need to be processed via a batch?
This is unfortunate: it makes an otherwise storage-agnostic class, SQL-dependent. Maybe we could introduce a SqlLiveRepositoryHandler subclass? Or we could open a follow-up (and a @todo here) to add method to
ContentEntityStorageInterfacethat the SQL storage can implement with a transaction start? Other ideas are welcome :)I don't understand how this is supposed to work: I'd expect to see some check that the workspace is actually deleted before counting its changed revisions...
Won't this complain also in the case where we are editing the entity in the correct workspace?
Anyway, the FALSE parameter is not needed: the signature has only 1 parameter.
Why is this stored as an expression? Isn't it a field?
Can we inject this?
Would it make sense to move this logic to a method in the repository handler?
typo
Let's use
WorkspaceSwitcherForm::class:)$viewseems to be unused.This seems useful information for the caller and could be moved the PHP docblock:
Awesome comment!
These comments do not wrap at column 80.
Can we create a follow-up and reference it here?
I can't be sure by just looking at the code, but it seems to me that this may not support the case where we are joining on the same entity type base/data table multiple times. Am I wrong or do we need to add a follow-up to support that as well?
multiple joins/aliases?
The last sentence is an implementation detail that could be removed or move into the method as an inline comment.
If I'm not mistaken, there is no test coverage for the Views query alteration logic, can we add a follow-up and a @todo to address that?
Can we add an assertion to check that the user has no access before being assigned the role?
The bot has these PHPCS complaints:
/var/lib/drupalci/workspace/jenkins-drupal_patches-56224/source/core/modules/workspace/tests/src/Functional/EntityResource/WorkspaceResourceTestBase.php ✗ 4 more
line 196 Line indented incorrectly; expected 6 spaces, found 8
199 Line indented incorrectly; expected 6 spaces, found 8
202 Line indented incorrectly; expected 6 spaces, found 8
205 Line indented incorrectly; expected 6 spaces, found 8
Comment #147
plachI forgot to mention that I didn't read the whole thread, sorry for any duplicate remark.
Comment #148
fabianx commentedRegarding entity_load (from a CPS module maintainer perspective):
This is different from the config overrides, because you already have a possibility to get to any version you want:
- You just specify an explicit revision to load
If the D8 version here also respects that, we are good, because then you can explictly load a different version, while modules relying on the default behavior will be automatically loading / editing and saving the right version - based on which workspace I am in.
And that makes behavior and isolation-of-concerns wise a lot of sense:
- There is no longer _one_ live site, but several. The workspace just determines which 'database' I use.
---
Unrelated question:
Will other entities get an published / unpublished status once Workspaces is in, so that they can too be workspace-d?
Because we found the true power of CPS shines when you can edit your whole site without worrying that any data is appearing live directly.
I am even right now porting metatag and redirect module to CPS. We already ported a version for a cps_path.
Comment #149
fabianx commentedPartial review (as I have fear of loosing progress).
TL;DR [so far]:
If I explicitly load a different revision, I want to be able to load that revision, but not get it exchanged under my belly. At least I should be able to load a different revision within the same workspace or set a flag to override that loading globally.
This could get very very long.
E.g. on some CPS sites we have 8100 workspaces.
Of course maybe this here is more like an -origin branch, so potentially it would be less if branches are re-used.
If I restrict it to in-progress workspaces I end up with "just" 48 ones, which still will clutter the permissions page a lot.
And it might be even less if the workspace has been merged / published.
(but then an entity field query should restrict this to active workspaces)
Loading all 8100 workspaces here, would timeout the CPS site for sure.
While I am not opposed to the idea to allow workspace based permissions, can this be optional?
Or at least optionally disabled?
Or maybe pushed to a follow-up - especially as it is a pretty dangerous permission.
This is very important as it ensures perfect performance for the default / 'live' workspace.
As hook_entity_load() is indeed called for revisionLoad(), this is a problem:
// When a non-default revision is being loaded, don't overwrite any of the
// contents of that revision.
// When the new drafty dependency is added, drafty module may not actually
// be enabled, and this can cause fatal errors on sites that load entities
// during bootstrap, so check for module existence.
In CPS it is solved with the following check:
Total nit, but could we use $revision_entity instead of just $revision.
I was severely confused by those lines.
So we never update the assignments per workspace, but rather create a new revision.
What is the use case for that?
(except for tracking multiple workspaces on a single workspace_association entity).
E.g. that could be totally a follow-up, but a problem we had long-time with CPS was that you need to cleanup all the historical information at some point.
Here you are creating even more information every time an entity is saved.
e.g. if a content editor edits a content item 20 times within their workspace, then we end up with 20 revisions here (and also in the entity tables / field tables itself).
Maybe this is not too bad, because it could even help with cleanup as we know all entity revisions that need cleanup directly.
Anyway: If there is not yet a follow-up task for cleanup, then it would be good to create one.
It will happen - from our experience.
Comment #150
catch@plach:
#146.7-9+12 (and one or two subsequent ones) I'd expect this interface to go away (or move down a level) after #2958752: Refactor workspace content replication plugin system to three services which is likely to have separate publish and merge services (and a further service for remote deployment). This has been the main thing bothering me about the design of the module, but it's quite a big change so happy addressing it in that follow-up.
---
Agreed with @fabian's point on entity_load() - if you need to work with a specific revision you should be able to load that specific revision without the logic kicking in (both that this is sufficient to address tstoeckler's concern, and also that the implementation should actually do that).
I need to double check the 8.x situation, but that functionality in drafty to determine whether the default revision was requested or not for 7.x was one of the trickier things to do properly:
Agreed on a follow-up to implement purging old workspace information. A big difference between this patch and CPS though is that it does not store the entire history of the site when a workspace is published (because there is no rollback yet, and CPS-style rollback is not planned for core) - so the only records in that table will be for actually-affected-entities which is a tiny proportion of most sites.
For other entities, #2880149: Convert taxonomy terms to be revisionable has been RTBC a few times now. #2336597: Convert path aliases to full featured entities is open for path aliases.
Comment #151
fabianx commentedNext partial review:
Nice!
I think this is fine for now and I believe we should leave revisions as low-level thing that someone may manipulate directly.
In the worst case it means that some contrib modules dealing directly with revisions need to add explicit workspaces support, but given that even in Drupal 7 the only modules that dealt a lot with revisions have all been workflow modules like workbench, cps, revisioning, drafty OR supporting module to enable workflow like functionality like taxonomy_revision, ...
which is now all in core, I am not seeing the case for a contrib or custom module that would query revisions directly AND not want to control the behavior.
Same with loading an explicit entity.
This is dangerous!
If you want to guard based on a value you always store the old value first.
e.g.
Everything else is looking for trouble - e.g. someone else wanted to set all_revisions to TRUE already.
Uhm, I am not getting how that would work to correctly select only the latest revision when an entity was saved twice in a workspace.
Maybe I am misunderstanding how COALESCE works, but I am not seeing how when I save a draft several times it joins to the correct version.
So this is the default conflict resolution process?
I have to see how 'publishing' of workspaces works, but would this not mean that a workspace needs to be deleted for the entity to be editable again?
How would a revert work (CPS feature that is used heavily by our content editors) - if the workspace is deleted?
Still understanding the architecture here.
If I wanted CPS like behavior to select workspace via a '?workspace=staging' URL parameter, I could just create a request listener to set the active workspace at the start of the request, right?
I am here missing the notions of being able to 'archive' a workspace.
Not necessary for the first commit, but more thinking out loud.
As this is created on demand, what happens in the race condition case?
e.g. workspaces is enabled on a live site and several editors are testing after the deployment?
I am okay if this just throws an exception as ID is already taken, was just making sure it would not create a duplicate workspace.
Huh? The other code gives the impression that you can push changes to any workspace:
e.g. several content editors could collaborate on a new feature, then merge all their changes together before those are pushed live.
This violates that?
e.g. If you can always only push live and pull from live, that is fine, but if you allow push to a different workspace than this needs to be changed here.
If I misread the above and there is a check for only pushing to default workspace (for now) I am good though.
Edit: Misread - this is the live repository handler, so this is okay.
Leaving here as others might be confused by the same thing ;).
While this current is not a problem, if we ever wanted to also act on latestRevision() (I get that comment now), we should explicitly specify to override the currently active workspace, so that we compare with whatever workspace we want to compare against (currently default / live):
e.g.
In cps we have:
cps_override_changeset('published');
// ...
cps_override_changeset(NULL);
---
For better code flow I would suggest to use a closure:
That also neatly solves the entity loading problem.
Can be follow-up, but ultimatively you need a way to quickly change workspaces.
We tried to avoid that in CPS at first, but it ended up to be necessary and useful.
Ugh, so this really deletes all old things, so a 'revert' will not be possible.
Potentially a 'revert' workspace could be automatically created (just tracking the old live site IDs before overwriting), but then it would still lead to conflicts, so probably that would need to live in another table, which then would be converted into a revert table on demand.
Tricky, but for our content editors again one of the most important (and useful features).
Sweet - this is exactly a use case for which cache contexts where made.
Great that this makes it easy :).
Comment #152
fabianx commentedThis is a potential race condition as the operation in between could take quite a long time.
It is better to re-get the workspace.deleted state once (unless that is cached statically?) to ensure we minimize conflicts.
Else locking would need to be added. (potentially the better idea anyway)
On a site with 20 content editors it is not too unrealistic for 2 of them to delete a workspace at the same time.
----
Finished review:
Overall: Great work! This will be a very very useful addition to core.
Some little things need work and some questions (especially related to how COALSECE queries work when there is several rows per entity_type / entity_id / workspace - tuples.
But maybe the latest revision per workspace could get a special 'active' or 'latest_revision' flag to avoid that - if I am correct and that is a problem. (I might be wrong, this is just from memory when I dealt myself with COALESCE in cps to fix some tricky bugs).
Or maybe there is some magic already when joining the table that was added in D8 for that ;).
Comment #153
fabianx commentedI was against that implementation anyway ;) - but it was a logical quick way to do it when you only have one table.
However here we have just a workspace and we can easily revert the changes by creating an archived revert workspace when publishing (similar to how a GIT revert works). Probably indeed best to track this in a workspace_revert table.
Which would still mean to delete the original workspace, like it is implemented now. Which means this is fine as is, as that is just an additional operation before the old workspace is deleted. And it can be in the same transaction.
And a revert of the revert could also be created as an active workspace.
Conflict resolution would just need to ensure that if a workspace needs to be reverted that drafts based on that workspace need to be deleted first. (or could also blow away changes from the workspace - the changes could still be restored by using the normal 'revisions' tab).
But then this can be implemented just by overriding some little things.
Therefore the architecture is sane to implement reverts in the future either in core or contrib and conflict resolution can easily be overridden by custom / contrib.
---
Only problem remaining I see with deleting old workspaces is that it makes cleanup difficult:
You don't know where an in-between revision (one of the 20 versions a content editor saved) came from originally, so you cannot cleanup those in-between revisions later - as that history is effectively lost.
Though if you have a revert table then those are just the revisions that are neither tracked by a workspace nor tracked in the revert table, so it can be cleaned up.
Comment #154
fabianx commentedRead @plach's review:
Is that necessarily so?
e.g. I am not sure if menu items are currently config or content entities, but they for sure should be changed as part of a global workspaces concept.
Thought potentially it would be okay to have ContentMenuItem instead for that use case.
For CPS 7.x we ended up creating our own (unfortunately proprietary as it was based on other proprietary module) custom_menu implementation, which was based on entities.
But I believe if we wanted to have menu items that allow that it might potentially be easier to make them revisionable AND add the published interface than to create a completely new type.
Though I might be wrong here.
But I wanted about the plan for menu links regarding workspaces anyway ;).
For me:
Content managed by workspaces == Content Editor can edit it during normal editorial workflows AND changes are directly related to other content
Comment #155
fabianx commentedRe - the temporary entity properties, I think this is tracked here:
#2896474: Provide an API to temporarily associate data with an entity
And I am very much +100 to that as it allows to vastly use more LazyBuilders in core as it is the #1 reason we cannot just lazily load the entity again - as it might (or even has) wild-west properties.
Comment #156
plach@Fabianx:
Menu item links can be YAML, config items or content entities. Content entities cover basically everything that's not config, so it shouldn't be a limitation.
Comment #157
amateescu commentedThanks everyone for the thorough reviews! I'll address each one in a separate comment to make the changes easier to follow :)
@tstoeckler, re #144:
FIrst, as @plach already pointed out, it's very clear by now that we need a generic API for getting the proper revision/language of an entity given a set of context information (current content language, current workspace, etc.), and #2942907: Entity system does not provide an API for retrieving an entity variant that is safe for editing is the issue where this will be handled. My initial thoughts for that API are also expressed roughly in #2126447-139: Patch testing.
\Drupal\workspace\WorkspaceManager::purgeDeletedWorkspacesBatch(), but I'm open to investigate changing this in a followup if you feel strongly about only keeping the IDs as values.workspace_associationentity type is never used in a context which would callvalidate()on it automatically (i.e. entity forms or REST requests), we only interact with it internally within the module.Yes, it should!
We are not missing the check,
\Drupal\workspace\WorkspaceManager::shouldAlterOperations()does exactly that :)Comment #158
lauriii.toolbar-tab should be removed from the selector.
This is not overridden in RTL so I think we can remove the comment as well.
We should rename this class to "workspace-toolbar-tab--is-live" since this state cannot be used on any other elements.
Would be great to get another review from the accessibility maintainers.
Comment #159
amateescu commented@plach, re #146:
getCreatedTime().isEntityTypeSupported(), renamed :)shouldAlterOperations()method name sounds exactly like what other methods should be calling prior to doing any runtime alterations.ContentEntityInterface, any entity type which is revisionable and publishable should work fine with workspaces.markAsPushed()is exactly the previous name of that method, but we changed it topostPush()in because its implementation was no longer about "marking" stuff: #2938852: Delete workspace_association entries when a workspace is pushedSOURCEandTARGETin there, but they would get too long. Are you ok with handling this one in a followup?getDifferringRevisionIdsOnTarget()andgetDifferringRevisionIdsOnSource()to match the othercheckConflictsOnTarget()method.$all_revisionsas an argument on the method, but I'm open to discussing it more in a followup if you feel strongly about it :) And the$group = FALSEparameter is indeed only used in tests, removed it.WorkspaceInterface.ContentEntityInterfacethroughout the code.applies()is used by all the other negotiator interfaces in core.. :)$settings['entity_update_batch_size']. Left a @todo to handle this in #2958752: Refactor workspace content replication plugin system to three services.count()is easy enough to do..$viewargument.Frontpageview in\Drupal\Tests\workspace\Kernel\WorkspaceIntegrationTest::assertWorkspaceStatus(). But the issue above is also about adding more coverage.Comment #160
plachThanks @amateescu!
re #159:
3: I meant that the PHP docs mentions both "operations" and "queries", so I was suggesting a more generic name that could encompass both. Alternatively, we can remove "queries" from the PHP doc :)
4: If we can actually support a more generic interface, that's great, but most of the workspace association entity field names have a
content_entity_prefix, which led me to believe only content entities were supported. Unless here "content" means "anything that's not a config entity". Anyway, IMO it would be less confusing if we had just anentity_prefix then. Ideally in a follow-up we could also add a test with an entity type implementing only the minimal set of supported interfaces to make sure those actually work.5: Sorry, but this doesn't sound a good enough reason to me: those action names seem generic enough to trigger a conflict, I don't see why we would risk that just to be consistent with Workflows. Actually I would try to also update the Workflows URIs to be namespaced, if it can be done without BC breaks. Not in this issue of course :)
6: Ok, can we then make the PHP doc a bit more generic? E.g.
Triggers clean-up operations after pushing.7: Totally fine with a follow-up, but I was only suggesting to reverse the words: e.g.
CONFLICT_CHANGE_ON_DELETE. Btw, I just noticed that the constants PHP docs are missing the third-person form :)9: 👍
10: Thanks for getting rid of the
$groupparam! The reason why I was suggesting two methods is, as you probably are well-aware, because I joined the no-bool-params camp ;)11: Ha, got it, thanks!
12: 👍
14: See 4 :)
24: Really? I need a refresh then, I'd swear
$element['#access'] = FALSEcauses the whole element to be hidden.25: 👍
26: 👍
27: Ok, but it doesn't check whether it's dealing with a deleted workspace: the violation is added whenever saving a workspace that has existing associations. I manually tested this and it seems that the only reason why this doesn't create problems is that the violation is filtered out because the ID field is not editable.
29: RTFM :D Jokes aside, I swear I read that comment, not sure why it didn't stick...
30: Shame, would you mind opening a ticket about allowing the Entity Query code to work with DI?
31: Ok, but it's not just
count(), it'scount(..., "advanced param I didn't know about until now") - count(...): it feels like something I could get wrong every time, should I have to write it myself ;)38: 👍
39: 👍
41: Ah, sorry, missed that: I was pretty exhausted and craving for some bedtime when I finally got to the test code ;)
43: 👍
Comment #161
plachOne more thing:
This is a weird method name if compared with the return value. I would expect this to be called
::getEntityTrackingWorkspaceIds()or something along those lines.Comment #162
amateescu commentedI'll address #160 quickly :)
3. Fair enough, I removed "queries" from the doc.
4. That
content_entity_prefix has also been bugging me for a while, thanks for insisting on it! I renamed the prefix totarget_entity_to match the entity reference terminology.5. Ok, but we don't really know how to namespace a link relation. It could mean either having a URI like
https://drupal.org/link-relations/workspace/activate-formorhttps://drupal.org/link-relations/workspace-activate-form. And I couldn't find any useful information about this on https://tools.ietf.org/html/rfc5988#section-4.26. Done!
7. Those constants will probably be moved/changed in #2958752: Refactor workspace content replication plugin system to three services, so let's move this discussion point over there :)
10. I know :P But I'm still not convinced by the no-bool-params argument and I think that it would be nice to be consistent with the outcome of the same discussion from #2955442: Add a way to get all the tables in which a field is stored from TableMappingInterface.
27. Good point and thanks for testing! I changed the constraint to only do stuff for newly created workspace entities.
30. Sure thing, here it is: #2968231: Use dependency injection in entity query
31. Ok, added
getNumberOfChangesOnTarget()andgetNumberOfChangesOnSource()toRepositoryHandlerInterface:)Re #161: Yep, I was also looking sideways at the name of this method :) Changed it according to your suggestion.
Comment #163
plachThanks for bearing with me @amateescu, the interdiff looks great!
re #162:
5: IMO, given that the standard does not provide indications, we should decide whether we want to keep underscores in module names. If for consistency we want to covert them to dashes, I think we should go with my original suggestion: a
formrelation defined by theworkspace_activatemodule would conflict withhttps://drupal.org/link-relations/workspace-activate-form, whereas using the slash would provide unambiguous results:https://drupal.org/link-relations/workspace/activate-formvshttps://drupal.org/link-relations/workspace-activate/form.7: 👍
10: Good point
30: Thanks!
31: Future me owes you one (more) beer :)
Comment #164
amateescu commented@Fabianx, re: the biggest pain point from #149 and @catch re: #150
\Drupal\Tests\workspace\Kernel\WorkspaceIntegrationTest::assertEntityRevisionLoad()begs to differ with this assessment :) We absolutely support individual (and multiple) revision loading and do not override them.This has been a work in progress for a few years already. @catch already posted a couple of related issue, but here's the full list:
#2721313: Upgrade path between revisionable / non-revisionable entities
#2820848: Make BlockContent entities publishable
#2880149: Convert taxonomy terms to be revisionable
#2880154: Convert comments to be revisionable
#2880152: Convert custom menu links to be revisionable
#2336597: Convert path aliases to full featured entities
The initial patch had four per-workspace permissions, and this point has been discussed earlier in the issue. We agreed to remove most of them and let individual uses cases derive and implement them if needed. But I think the remaining
bypass entity access workspace $workspace_idone is important and should be kept.$revisionis how we refer to an explicit entity revision everywhere else in the D8 codebase, so I'm not sure what it could be confused with..What we're doing here is basically taking advantage of the optimized revision storage that core provides, which gives us two tables to work with: default revisions and all revisions.
When we update a workspace association entity (i.e. when we create a new revision for it), the latest revision ID of the tracked entity is stored in the default revision table, and the history of the previous workspace revisions is kept in the all revisions table. This allows us to do two important things:
1) when we need to join against the latest workspace-specific revision, for example in
COALESCE()calls, we join the default revision table.2) when we need to delete a workspace, we know all the individual revisions that were tracked in that workspace and that need to be deleted.
This also answers #151.4 (why our
COALESCEcalls work properly) :)Not sure if you're referring to a specific kind of cleanup, but we already support deleting workspaces, and we should encourage site editors to do that once their work is done and a workspace has been published to live.
Re #151:
1. Yes! Entity Query in D8 is a joy to work with :)
2. That @todo about what to do with queries for the latest revision will most likely be fixed by #2942907: Entity system does not provide an API for retrieving an entity variant that is safe for editing.
In fact, I think the entire concept of "latest revision" is wrong and we need to move to a "negotiated revision based on variable context information" way of thinking.
3. Good point! Done.
4. This is answered above. It works because the table we join in the
COALESCE()is the one which stores the latest workspace-specific revision ID (the default revision table of the workspace association entity type).5.
Indeed, a workspace needs to be either published or deleted before an entity that is tracked in any non-default workspace can be editable again.
6.
Nope, you would need to add a workspace negotiator. We actually had a query parameter negotiator in the initial patches but it was taken out in the meantime. Do you think it is useful and that we should add it back?
7.
As @catch mentioned, archiving a workspace is not planned for the core module :)
8.
This not only created on demand, it is a memory-only representation of the default workspace, meaning that it's not saved to a storage anywhere, so there's no chance of conflicting IDs.
9. Yup, that's the live repository handler, which will probably be copied mostly as-is to a separate "workspace publish" service in #2958752: Refactor workspace content replication plugin system to three services.
10. That's very neat, I like it! Opened #2968452: Add a way to execute a function in the context of a specific workspace to discuss a bit more :)
11. Yes, reverting a workspace is something that we need to think about at some point. With the current architecture, the easiest thing to implement would be to only allow reverting the latest workspace that was published. In any case, a full revert solution will require conflict resolution and we're not there yet.
12.
And they solve the render caching problem perfectly. My only wish is that they were made even more generic, so we could use them for other caching as well, e.g. for deriving the cache keys of the persistent entity cache. Right now we have to clear the entire entity cache when we switch workspaces, but we have an issue to improve that: #2924218: Clearing the persistent entity cache every time we switch between workspaces is super wasteful
Re #152:
I think we can fix the potential race condition if we remove the deleted workspace ID entry only when we're sure that there are no more tracked revisions left. This has a small downside that the deleted workspace ID entry will be removed from state only in the next batch run, but I don't think that's a big problem.
Comment #165
catchI can't find it, but Berdir has an issue proposing exactly this.
Comment #166
amateescu commentedNice! I'll ask him and see how I can help there, so we can fix #2924218: Clearing the persistent entity cache every time we switch between workspaces is super wasteful without any other change than that tiny patch :)
The fix I proposed for #152 (delete race condition) doesn't work, so we need to change the logic a bit and check if we have any new workspace association revisions after a purge loop.
Comment #167
catchFound it! #2551419: Abstract RenderCache into a separate service that is capable of cache redirects in a non-render array-specific way.
Comment #168
amateescu commentedAnd last but not least, this interdiff should address #158. Thanks, @lauriii!
As for the accessibility maintainer's review, this patch only adds a minimal UI implementation using core's toolbar, so their attention would be much better spent in the issue which will add the full workspace UI: #2949991: Add workspace UI in top dialog
Comment #169
amateescu commentedPhew, all the reviews have been addressed, so here's a new patch :)
The interdiff is relative to #143.
Comment #170
timmillwoodAwesome work over the last couple of days @amateescu, looks good to me.
Only outstanding item is a change record!
Comment #171
plachI thought we were going to address also #163.5 :)
Comment #172
amateescu commentedHere's the change record: https://www.drupal.org/node/2968491
@plach, tbh, I'm still not sure if namespacing link relations with a slash is possible. The entity annotation would look like
"workspace/activate-form" = "/admin/config/workflow/workspace/manage/{workspace}/activate",?Let's discuss this in a new issue so we can also fix Workflows at the same time.
Comment #173
amateescu commentedI've updated the roadmap issue and listed the two known hard blockers for beta stability as must-have in the IS. However, there are a few more followups that need to be categorizes, see #2732071-19: WI: Workspace module roadmap.
Comment #174
cosmicdreams commentedI love the diagrams provided in the change record. I hope that it will eventually also include mention of any new APIs or existing APIs that have been modified / deprecated.
Comment #175
fabianx commentedI would re-add it in a major follow-up.
The use case is:
For example an advanced workflow (of our content editors) could look like:
So a very useful functionality to switch via URL.
Comment #176
fabianx commentedCan we add least add a setting for that?
I am okay if that is so low-level that we just use a
$settings['workspace.disable_permissions'] = TRUE;Or how would custom / contrib disable that easily?
Ahhh, yes as the workspace associations are also entities that makes perfect sense now :).
They can't ;).
Once a workspace has been published to 'live', all association of entities edited get lost (as the workspace associations are deleted).
But it also means that all in-between entity revisions are kept forever.
e.g. the difference between merging a branch vs. flattening that branch.
So probably the option to flatten "branch" during publishing live is all that is needed to be done in a follow-up?
I think there should be at least a major follow-up for that functionality - as currently the deletion makes any kind of 'undo' impossible.
e.g. I cannot imagine our content editors to work without being able to revert things.
It also is not _that_ difficult to do as all we need is a table with all that information, e.g. even in a serialized format.
Then re-creating the old revision IDs is simple to do by creating a revert workspace:
- Create the workspace 'revert_staging_[some_id]',
- Re-create all the workspace association entities, but associate with old 'live' workspace revision_id
- Workspace can then be published like normally
Only trouble is how to revert several times, but probably can just allow to revert any workspace, which has no conflicts ...
Comment #177
fabianx commentedRTBC + 1 - no more blockers from my side
Comment #178
plach@#172:
I'm not sure how the annotation would be tied to the YAML relation definition, but if I understand the RFC correctly the relation name would be the whole URI, so something like this:
I guess in this scenario
"workspace/activate-form" = "/admin/config/workflow/workspace/manage/{workspace}/activate"would make sense to me.I don’t think this is a blocker for commit, but we should get that sorted before stable, to avoid BC issues. Maybe a "workspace-critical" follow-up?
Comment #179
amateescu commented@cosmicdreams, the nice thing about the current architecture of the module is that it doesn't need any special API nor does it provide any specific integration points. All you need to do is make sure that your custom entity types are revisionable and publishable, and then everything works out automagically :)
Opened a few more followups for #175, #176 and #178:
#2968875: Bring back the query parameter workspace negotiator
#2968861: Add a way to revert (undo) a workspace
#2968850: Figure out use-cases for per-workspace permissions and provide them if needed
#2968856: Consider adding an option to "flatten" the revisions of a workspace before publishing it
#2968830: Figure out how to namespace link relations
Comment #180
catchI've bumped #2968850: Figure out use-cases for per-workspace permissions and provide them if needed to a beta blocker, although to be honest I don't understand the use-case for that permission at all so would be happy to just remove it and start from use cases.
Not sure on the blocking-ness of the other issues but glad to see the follow-ups created.
Comment #181
amateescu commentedOk, let's remove it and use that issue to figure out if we really need any per-workspace permissions or not.
Comment #182
fabianx commentedThanks for all the follow-ups.
Interdiff is RTBC as well.
Comment #183
catchWhy is this also removed?
Comment #184
amateescu commentedBecause it wasn't used anywhere, so I thought it's a good one to discuss whether it's needed or not.
Comment #186
timmillwoodComment #187
amateescu commentedI think those are genuine test fails coming from #2894261: Deprecated service entity.manager needs to be replaced with entity_type.manager in ContentEntityForm.
Comment #188
catchDid my best with commit credit but if I messed it up let me know.
Having worked on this area in contrib in Drupal 7, it's an absolute nightmare and while there's a lot of work to do still in core generally, this module gets us closer to resolving a lot of different inter-related issues.
Committed/pushed to 8.6.x, thanks!!
See you in the follow-ups.
Comment #189
timmillwood@catch Awesome, thank you!!
Can we add commit credit to @jeqq who worked a lot on the original contrib module?
Comment #192
catchSpoke too soon, fixed these on commit:
Comment #193
phenaproximaCongratulations to everyone on landing this beast. Awesome work!
Comment #194
plachAwesome work, guys!
Comment #195
dixon_I'm at loss of words :) Thanks everyone who has been involved in this by designing, building, reviewing, testing, discussing, thinking, asking questions at our DrupalCon session, or really anything!
I'm really proud of our initiative for this big milestone :)
Comment #196
fabianx commentedCongratulations!
Comment #197
webchick*GREAT* WORK!! :D GO TEAM GO!! :D
Comment #198
fabianx commentedLate thought for the party, but I think we should also credit 'merlinofchaos' for the original CPS work that influenced a lot of design decisions of this module.
Comment #201
catchComment #202
jibranDER and Workspace both override EFQ. Right now DER v2 EFQ integration will break if the workspace module will be enabled. Related DER issue is #2835542-11: Utilize service decorator instead of taking over entire EFQ service.
Comment #203
gábor hojtsy