Problem/Motivation

@plach in #2860097-129: Ensure that content translations can be moderated independently:

  • Open another follow-up to finally discuss the definition of pending revision, how different behaviors can be reconciled or at least make sure they don't conflict with each other, and possibly re-enable latest revision in any entity form. Basically continue on the foundations laid by #2860097-126: Ensure that content translations can be moderated independently, which looks like a promising start. Since we were ready to perform this change in between 8.4 and 8.5, it should be ok to perform it in any minor, but I'm not sure we will be able to make that happen in 8.5.

@Sam152 in #2860097-126: Ensure that content translations can be moderated independently:

My question and the source of disagreement is: what are the expectations developers have with regards to how core entity forms behave when a pending revision is created? In the content_moderation use case, ignoring those pending revisions in entity forms is data loss. Are other (current or future) consumers of this api in the same boat or are there concrete use cases out there where this would mean data leaking instead of data being preserved? The entry point into the whole API doesn't strictly define exactly what "pending" or "forward" implies exactly.

Is it helpful to be more or less opinionated about this? Are we preventing headaches for someone with $usecase or making false assumptions which are equally tricky to undo. I think I'm biased towards the former simply based on content_moderation being the "flagship" consumer of this API. I think "latest" and "pending" have some implication towards "this is the point at which things progress" more than "this is a generalised place to store more information". Purely theoretical, but if we nudged integrators towards this paradigm, is there be more scope for things to play nicely together?

The flip side of all of this breaking BC really sucks. My belief is still that this should be treated as fixing a critical bug, furthering the efforts to make the latest revision a first class citizen however @hchonov makes great points that this should be carefully evaluated.

So:

  1. the concept of "pending revision" is open to interpretation, a clear documented definition would help avoid repeating the same discussion on many issues
  2. define how different kinds of forward revisions ("pending revisions" are just one kind) can co-exist, for example #2860097-128: Ensure that content translations can be moderated independently (Actually […])

Proposed resolution

TBD

Remaining tasks

Discuss. Start from #2860097-126: Ensure that content translations can be moderated independently, cited in the IS.

User interface changes

TBD

API changes

TBD

Data model changes

TBD

Comments

Sam152 created an issue. See original summary.

Sam152’s picture

Issue summary: View changes
Sam152’s picture

The major issue I see is, should core take an opinionated stance on which entity revision forms the basis of updates to an item of content? Modules like content_moderation use pending revisions in such a way where it only makes sense to add new content the latest revision. As a result, up until recently it was required within content moderation to alter the behavior of content entity forms to ensure the latest revision was indeed the one being modified.

The argument against moving these semantics to core (and enforcing them) so far has been: what if I'm using pending revisions in such a way where the default revision still makes sense to be starting point for further edits. The example given was an autosave feature, where pending revisions were storage for content which was "uncommitted" in a sense.

With the former approach we're removing the usefulness of pending revisions for a whole class of use cases which might depend on them being more transient than something like a draft content edit. I think a major counter argument to the latter approach is this:

If there is no strictly defined content revision which exists as the starting point for further content additions, our APIs for making such content changes are useless.

One concrete example we have is quickedit. Up until recently quickedit would happily edit the default revision of content, regardless of pending revisions. The CM use-case for pending revisions dictated such changes were reserved for the latest revision. An autosave-like usecase would dictate the opposite.

It's not possible for content moderation (or other users of pending revisions with similar requirements) to attempt to make alterations to every system that updates content (entity forms, quickedit, $some_module), thus for the usefulness of both pending revisions and the general entity/content/field API, I would propose the following: the latest revision is the preferred starting point for all new content revisions. I think systems that don't respect this should have open bug reports for data loss.

I've already been opening issues and working on parts of core which would help this happen, without considering the latter use-case (I truthfully didn't realise it was one that existed), my motivations were to improve pending revisions independent of content_moderation. So hopefully we can get some consensus around what pending revisions should actually be used for and ensure that is clearly documented. I think the benefit to cost of deciding would be huge. I'd optimistically suggest for the latter use case there are other means of storing content that don't specifically involve creating new revisions.

Wim Leers’s picture

Assigned: Unassigned » hchonov

We really need @hchonov to weigh in here, he seems to have the most developed opinion and insight on latest/pending/… revisions :) So, assigning to him!

As far as I can tell, #3 is an excellent summary of the concerns raised by @hchonov already though! Still, would be great if @hchonov could confirm, and if he could add any nuance he considers necessary.

Wim Leers’s picture

Priority: Normal » Major
Issue summary: View changes
Issue tags: +Workflow Initiative
Related issues: +#2860097: Ensure that content translations can be moderated independently

Oops, I created #2940904: Settle on definition of "pending revision" for the same purpose as this. Closed it as a duplicate. But moving its IS here.

Sam152’s picture

Ah, indeed! Thank you for the IS update.

hchonov’s picture

Assigned: hchonov » Unassigned

@Sam152, @Wim Leers thank you both for summarising all the talks we had.

Unfortunately I feel like the only one having a bad feeling about enforcing the content moderation use case as the default use case regarding forward revisions. Almost everyone that currently works on forward revisions in core is involved only because of content moderation. I guess this is the difference and the reason why I have a different opinion - I am not involved with the development of the content moderation module. I think that it would be better to find more people not involved in the development of the content moderation module as well, so that they are not biased by the direction the module has taken.

We should consider use cases where forward revisions might be used and if there are none we could try to find and create them.

Probably we could even submit a core conversation for Drupal Europe 2018 on the subject and see what others think about this. I hope that we reach to more developers with different ideas and point of views. I guess we need something like a "call for papers" on the subject forward revisions :).

One more use case I could think of would be a nested inline entity form where each inline entity form might be saved separately into a forward revision and the forward revision is allowed to become an official/default revision only when the main entity form is submitted.

In the content_moderation use case, ignoring those pending revisions in entity forms is data loss.

Why? Could you please elaborate on this? What if there are two users and the one started a draft for rewriting the document body and the other one simply wants to edit the default revision to fix a typo and for this the default and not the forward revision is needed? Could someone please provide more details about the exact problems content moderation faces and how exactly it works?

I have further questions regarding drafts - what if I have two different entity form displays for the same entity type configured with different fields and one common field - EFD.A and EFD.B. Then I have user group UG.A allowed to edit EFD.A and user group UG.B allowed to edit EFD.B UG.B starts editing an entity and edits only the fields that it has access to and doesn't touch the common field. Now before UG.B publishes the draft as a default revision suddenly there comes someone from UG.A and starts working on the common field by using the other form display - EFD.A. So how could UG.B now publish the draft as default revision but without the changes on the common field made by the other user group? I am not sure if content moderation has considered this use case, but I think it shows pretty good the drawback of always using latest revision in the entity form API.

Sam152’s picture

I still don't know how we're going to address:

If there is no strictly defined content revision which exists as the starting point for further content additions, our APIs for making such content changes are useless.

IMO, if a stance isn't taken here, we lose a lot more than potential use cases that haven't been conceived of. I could see some scope for some getPreferedMutableRevision method trying to smooth over the bumps here, but I'd ask is it really worth it? Making all the APIs wrt translations, pending revisions, non-revisionable fields seems to be a quite a big task to begin with and is still in progress, I don't see how it'd ever be possible to make content_moderation responsible for switching such semantics on and off, across every subsystem or module that might want to update an entity.

I also think content_moderation is the perfect use case to inform discussions like this, because it has had to have the most far reaching compatibility with other core subsystems and has uncovered existential questions like this. Every future user of the API will have to content with such problems on their own, or efforts can be made to keep a stricter model of pending revisions agnostic of content moderation.

We both understand the arguments in either camp, so I think we've simply reached a disagreement :). Interested to hear thoughts from some others with experience in the domain.

hchonov’s picture

I still don't know how we're going to address:

If there is no strictly defined content revision which exists as the starting point for further content additions, our APIs for making such content changes are useless.

Regarding this I have asked a question in my previous comment, which I think is pretty important:

In the content_moderation use case, ignoring those pending revisions in entity forms is data loss.

Why? Could you please elaborate on this? What if there are two users and the one started a draft for rewriting the document body and the other one simply wants to edit the default revision to fix a typo and for this the default and not the forward revision is needed? Could someone please provide more details about the exact problems content moderation faces and how exactly it works?

I have further questions regarding drafts - what if I have two different entity form displays for the same entity type configured with different fields and one common field - EFD.A and EFD.B. Then I have user group UG.A allowed to edit EFD.A and user group UG.B allowed to edit EFD.B UG.B starts editing an entity and edits only the fields that it has access to and doesn't touch the common field. Now before UG.B publishes the draft as a default revision suddenly there comes someone from UG.A and starts working on the common field by using the other form display - EFD.A. So how could UG.B now publish the draft as default revision but without the changes on the common field made by the other user group? I am not sure if content moderation has considered this use case, but I think it shows pretty good the drawback of always using latest revision in the entity form API.

Sam152’s picture

If content is created in a pending revision, when the default revision is loaded, changed and saved it all of the data created between the default and latest pending revision is gone. The revision based on the default, with the changes applied is now the latest and default, with the original pending revisions no longer being pending.

hchonov’s picture

If content is created in a pending revision, when the default revision is loaded, changed and saved it all of the data created between the default and latest pending revision is gone. The revision based on the default, with the changes applied is now the latest and default, with the original pending revisions no longer being pending.

This sounds like something that could be fixed by a proper conflict resolution without any changes to core for this. The conflict module already solves the problem where a user is currently working on a cached form of an entity which in the meanwhile got a newer default revision by another user. It even takes care of auto merging non-edited fields and fields with no edit access.

What about the example with the two form displays and two user groups?

Sam152’s picture

I don't think conflict solves this. Consider the case of quick edit, we need to know what revision to load from as well as save to.

hchonov’s picture

Consider the case of quick edit, we need to know what revision to load from as well as save to.

I would appreciate some clarification. You mean the problem is that quick edit will edit the default revision and create a new default one? CM could catch up and and when a new default revision is created, it could recreate the forward revisions or it could use a special field to search for the forward revisions which are still to be considered forward even if they are behind the default revision. What happens then is that you get a newer default revision than the draft one and later when you want to save the draft you'll need the conflict module to resolve the conflicts that have been cased meanwhile. So what is the problem with such an approach?

tim.plunkett’s picture

See also #2942907: Entity system does not provide an API for retrieving an entity variant that is safe for editing. Please expand that as you see fit, but there needs to be a clear API for other code to use at the end of all this.

hchonov’s picture

Unfortunately there is one issue that has made its way in - #2878556: Ensure that changes to untranslatable fields affect only one translation in pending revisions, which basically enforces the CM behavior before we've decided the way to go in the issue here. If we make the changes of the referenced issue public then the core way of using forward revisions will be occupied and enforced by CM.

catch’s picture

Just discussed this with plach a bit, going to try to summarise where we got to.

First of all, some propositions:

1. Default revisions are always sequential

2. Pending revisions may either be sequential or parallel

3. When a pending revision is saved as a new default revision, it either represents moving the default revision along a sequence (vid 6 default, 7 pending, 8 pending, 9 pending, 10 default) or will leave non-sequential revisions still pending.

4. To reconcile non-sequential revisions, conflict management must be available to avoid unintended data loss.

I'd really like us to stop using 'the content moderation workflow' vs. 'autosave workflow', and talk about sequential vs. non-sequential revisions. Autosave and content moderation are end-user features as opposed to distinct API use cases.

If we look at Slack's post functionality, it allows multiple users to edit a post, with autosave, but only sequentially:

User A opens a post and shares it. While they are editing, other users can view the post and comment on it (with auto-refresh of the content), but not edit it - they get a message 'catch is editing...' and the edit button disabled.

When User 1 stops editing (by closing the window or clicking 'done editing') User B or C can now start editing the post.

This is a use-case for autosave, but it does not in any way suggest or require non-sequential revisions or conflict management - probably because Slack didn't want to deal with that.

A simple use-case that could lead to non-sequential revisions is fixing a typo in the default revision while there is a larger edit in a pending draft. If there is no conflict revolution, the larger edit could undo the typo fix once it's saved.

So my own view on this is that we should by default always present the latest revision to users when editing, and prevent non-sequential revisions (with some possible modifications for translations).

Conflict module, or modules depending on it, should be able to undo that behaviour however it's enforced, but I think it's reasonable to expect them to explicitly disable something that's preventing data-loss - and which applies to any API use-case for pending revisions - at the end, you're going to have to apply the changes to sequential default revisions, regardless of how they were created, and this always means one of locking/validation, conflict resolution, or a free-for-all.

amateescu’s picture

So my own view on this is that we should by default always present the latest revision to users when editing, and prevent non-sequential revisions (with some possible modifications for translations).

The main problem with preventing non-sequential revisions is that it makes all existing use-cases for pending revisions inherently incompatible with each other. Let's take a very simple example:

A site enables workspace support for an entity type, which means that:
a) it can not enable Content Moderation on the same entity type because the workspace would need to also track the publishing state
b) it can not enable Autosave on the same entity type because the workspace would need to also track per-user changes

The conditions above can be interchanged freely, but the end result will be the same: each module working with pending revisions needs to be aware and integrate with *every* other module that also works with pending revisions.

Additionally, without non-sequential revisions (revision branches) and a proper revision merging strategy (conflict resolution), many basic use-cases are left out in the cold:

- as @catch noted above: fixing a typo in the default revision while there is a larger edit in a pending draft. If there is no conflict resolution, the larger edit could undo the typo fix once it's saved.
- workspaces: the same entity can not be edited in two workspaces at the same time
- autosave: the same entity can not be edited by two users at the same time

I haven't followed the latest developments in Content Moderation very closely, but I think it has similar limitations regarding translations.

After thinking about this topic in the past few days, I think that CM is the only one that can work just with sequential revisions (again, without knowing very well how it handles translation conflicts), but workspaces and autosave need revision branches and conflict resolution if we want to fully unlock their potential, because they have to determine the correct revision to load/edit based on different contexts (per-workspace and per-user respectively).

As for always presenting the latest revision when editing, I have to disagree on that. At least for workspaces, it creates an additional "burden" of having to alter latest revision interrogations (entity load, entity queries, views queries) alongside default revision ones. I assume autosave would have similar problems to overcome.

Therefore, for now I think that core should leave it up to each module that exposes pending revisions in the UI to be responsible for determining which revision should be used on load/edit.

catch’s picture

b) it can not enable Autosave on the same entity type because the workspace would need to also track per-user changes

This assumes that you don't autosave a new revision as the current draft each time. You could have a new draft in moderation, autosave new revisions on the node form, content moderation takes the last of those revisions for moderation and publishing. Same as editing without autosave really.

What complicates it is per-user autosave, but that already presupposes conflict resolution with autosave just on its own.

It's the same with workspaces - with just workspaces, you either limit to sequential edits, or you don't and have to deal with conflict resolution.

Sam152’s picture

In the workspaces model, all queries are already qualified with the active workspace right? Isn't the model sequential revisions, just sequential per workspace? Ie, for all pending revisions created in any workspace, we do still want the latest revision for editing, just the latest with an additional qualifier of workspace ID.

amateescu’s picture

@catch, @Sam152, yes, sequential edits work fine individually for each use case (with various limitations due to missing conflict resolution), but the point that I was trying to make is that non-sequential edits (revision branches) would allow all of them to work together without having to know about each other :)

plach’s picture

I think we should eventually bring a conflict management solution in core and unlock the potential of parallel revisioning. However this is not going to happen in 8.5. OTOH 8.5.0 will be the first release shipping a (stable) core feature that supports pending revisions, so it should somehow set a direction on this matter.

I believe it is critical that we find a way to support seamless parallel revisioning in contrib, at least until we are ready to do this in core (8.6? 8.7?), which will likely require #2942907: Entity system does not provide an API for retrieving an entity variant that is safe for editing and possibly tracking the revision owner (see the discussions in #2891215: Add a way to track whether a revision was default when originally created).

It would be great if we could do #2942907: Entity system does not provide an API for retrieving an entity variant that is safe for editing before releasing 8.5, but it will be badly needed in core alone for sure in 8.6, when Layout Builder will be (hopefully) stable and Workspaces at very least beta.

All that said, I think it is reasonable to limit core to sequential revisioning until it is ready to support conflict management, because there's really no other way to properly handle the data issues that would otherwise arise.

@amateescu:

Do you think that a centralized API allowing to negotiate the revision to be edited, would allow a seamless integration among different modules supporting pending revisions for different use cases? Can we come up with examples of how we envision at least core modules working together? I think formally stating which use cases we have in mind would help in designing this API. Of course we don't want to limit ourselves to those only, but at least we would all be on the same page. After talking with EclipseGC, I believe that we should expect at least the following core modules to have to play well together: Quick Edit, Layout Builder, Content Moderation and Workspaces. And Content Translation will need to keep working on top of that :)


Aside from that, since this issue is aiming to define the concept of pending revision itself, I think we should also clarify our terminology once and for all: if I'm not mistaken, so far we have been defining pending revision as a revision saved with non-default state. However pending conveys the idea that something still needs to be done with that revision. Is a revision obsoleted by a subsequent non-default revision still pending? Or should it be called something else? Or should we reinforce the idea that pending in our book literally means non-default, i.e. any change that didn't make it into a default revision is labelled as pending, regardless of its freshness? If pending is only the latest (translation-affecting) non-default revision having a default revision ancestor, then parallel revisioning has the potential to disrupt this definition, or at least to require concepts that we didn't define yet. OTOH if we stick with the simple definition (pending = non-default), then any non-default revision is pending regardless of its role/position in the revision graph.

plach’s picture

Sam152’s picture

@amateescu, thanks for clarifying, that makes sense. #21 also sounds like a great summary of the current state of things for the real-world problems we have today.

All that said, I think it is reasonable to limit core to sequential revisioning until it is ready to support conflict management, because there's really no other way to properly handle the data issues that would otherwise arise.

I think this is my biggest reservation for adding an API for centrally negotiating the edit-revision. If we provide a mechanism for developers to change the edit-revision to some arbitrary revision in the tree, are we able to make any guarantees with regards to integrity of data across translations or other modules? Does it become part of the API surface in core and a potential source headaches if we're supporting any revision exist as the edit-revision. Do we know how things behave if the edit-revision is a black box? What do we do when integrity bugs are logged when the edit-revision was set to something completely unexpected?

IMO the solution which provides the most flexibility as future APIs evolve and things like conflict resolution start to appear is a convention to use existing methods like ContentEntityStorageBase::getLatestRevisionId. Down the track if a centralised API and negotiation system is the solution, modules can transition to using that. If it turns out we have solid conflict resolution and modules have the option of loading the default or latest revision (or something else), then modules can transition to that.

If we're arguing conflict resolution is the answer to compatibility of different systems loading different revisions for editing, isn't the concept of a "safe revision" kind of moot at that point?

plach’s picture

A (very long) call was held today among some interested folks, the full notes are available here:

https://docs.google.com/document/d/1ku6y-eYkUQXGlstVDoFdWOyHhpYQ6FA35rdC...

A very short summary is: nobody is completely happy about the current status of 8.5.x, but we all agree that it is shippable as long as we commit to do #2942907: Entity system does not provide an API for retrieving an entity variant that is safe for editing in 8.6. The end goal is having an API allowing to seamlessly switch between sequential and parallel revisioning and making it easy for multiple modules exploiting the latter to play well together. The constraints currently active in core would be then only activated when sequential processing is active. Parallel revisioning would be probably be allowed only when a conflict management solution is available, and even then it might be possible to disable it at runtime.

On the terminology side, everybody was onboard with calling active the right revision to edit for a specific context, so this probably addresses my questions in #21: a pending revision is any non-default revision, an active revision is the one that is supposed to be loaded in the entity form for the current context, and it may be pending or default.

Sam152’s picture

Thanks for the summary @plach and everyone else who joined the call.

With #2942907: Entity system does not provide an API for retrieving an entity variant that is safe for editing targeting 8.6, I've tentatively rejigged #2942675 to support the currently sequential-only use case 8.5 will ship with: #2942675: Layout builder should use the active variant of an entity to avoid orphaned revisions.

Fabianx’s picture

Very late to the party, but 90% of the problems stem from the implementation of content moderation and are not affecting workspaces:

The main problems with layout builder and CM are from content moderation and would not happen with workspaces.

e.g. If the draft state of CM in reality is a micro-workspace containing just one entity => it just works

Just because CM relies on forward revisions not the default one to do so, creates those problems as 'draft' is never stored as a property on the entity when saving.

But not a problem in workspaces, so this is 100% Content moderation restriction (e.g. drafty in D7 also solves this) and there is really no need for a complex conflict resolution process (which helps for other things, but that is unrelated).

The main thing is to start thinking in workspaces and think of content moderation as single-entity workspace (not saying it needs to be based on that, but saying that conceptually it is superior):

- If layout builder loads an entity which is in the 'draft:[entity_id]' workspace, then it will also be saved in that workspace => no more problems with the new revision.

- If layout builder loads an entity, which is in the 'live:[entity_id]' workspace, then it will also be saved as live revision => no problems.

(Yes there is still problems with conflict resolution, but those are the same as before and as workspaces do not allow that right now, you could not edit an entity while a pending draft is there).

Also those issues are pre-existing with CM right now anyway ...

Version: 8.6.x-dev » 8.7.x-dev

Drupal 8.6.0-alpha1 will be released the week of July 16, 2018, which means new developments and disruptive changes should now be targeted against the 8.7.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.7.x-dev » 8.8.x-dev

Drupal 8.7.0-alpha1 will be released the week of March 11, 2019, which means new developments and disruptive changes should now be targeted against the 8.8.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.8.x-dev » 8.9.x-dev

Drupal 8.8.0-alpha1 will be released the week of October 14th, 2019, which means new developments and disruptive changes should now be targeted against the 8.9.x-dev branch. (Any changes to 8.9.x will also be committed to 9.0.x in preparation for Drupal 9’s release, but some changes like significant feature additions will be deferred to 9.1.x.). For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

RoSk0’s picture

Version: 8.9.x-dev » 9.1.x-dev

Drupal 8.9.0-beta1 was released on March 20, 2020. 8.9.x is the final, long-term support (LTS) minor release of Drupal 8, which means new developments and disruptive changes should now be targeted against the 9.1.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

Version: 9.1.x-dev » 9.2.x-dev

Drupal 9.1.0-alpha1 will be released the week of October 19, 2020, which means new developments and disruptive changes should now be targeted for the 9.2.x-dev branch. For more information see the Drupal 9 minor version schedule and the Allowed changes during the Drupal 9 release cycle.

Version: 9.2.x-dev » 9.3.x-dev

Drupal 9.2.0-alpha1 will be released the week of May 3, 2021, which means new developments and disruptive changes should now be targeted for the 9.3.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 9.3.x-dev » 9.4.x-dev

Drupal 9.3.0-rc1 was released on November 26, 2021, which means new developments and disruptive changes should now be targeted for the 9.4.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 9.4.x-dev » 9.5.x-dev

Drupal 9.4.0-alpha1 was released on May 6, 2022, which means new developments and disruptive changes should now be targeted for the 9.5.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 9.5.x-dev » 10.1.x-dev

Drupal 9.5.0-beta2 and Drupal 10.0.0-beta2 were released on September 29, 2022, which means new developments and disruptive changes should now be targeted for the 10.1.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 10.1.x-dev » 11.x-dev

Drupal core is moving towards using a “main” branch. As an interim step, a new 11.x branch has been opened, as Drupal.org infrastructure cannot currently fully support a branch named main. New developments and disruptive changes should now be targeted for the 11.x branch, which currently accepts only minor-version allowed changes. For more information, see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.