Problem/Motivation
The ability to save forward revisions was added in #218755: Support revisions in different states and—while it is not utilized by Core's UI—contrib modules are now able to save new draft revisions without unpublishing the currently published revision, or implementing unwieldy hacks. However, in the discussion of #2429153: On node revision overview, use 'Set current' if revisions are newer than current version it was indicated that there is currently no officially supported mechanism in Core to set a forward revision to be the current revision. This seems like a major oversight.
The expectation for any module implementing content moderation is that a forward revision can simply be set as the default (a.k.a. "current") revision, without creating a new revision or changing the content of said revision. It seems like this should be possible by loading a forward revision and saving the entity, setting $newRevision = FALSE and $isDefaultRevision = TRUE. However, this method does not appear to be documented or included in any tests, so its function and support are questionable.
Proposed resolution
Either confirm that the above method is the official way to set a forward revision as the default revision and create tests to enforce that behavior, or provide an alternate method to set a forward revision as the default revision. However it is accomplished, this should update the base tables for the node and its associated fields to reflect the more current revision, without creating a new revision.
Remaining tasks
Do all the things.
User interface changes
Core does not currently utilize forward revision functionality, so no changes to the UI are anticipated. However, it may impact how #2429153: On node revision overview, use 'Set current' if revisions are newer than current version is implemented.
API changes
New tests are needed, but at this time no changes to the underlying API are anticipated.
| Comment | File | Size | Author |
|---|---|---|---|
| #25 | Selection_105_0.png | 24.35 KB | timmillwood |
| #20 | Revisions_2477419_20161215_D34dMan_V1.0.png | 41.38 KB | d34dman |
Comments
Comment #1
Jaesin commented
One problem is that there isn't a good way to log a revision switch without creating a new revision entry in the node_revision table.
Comment #2
jstoller
We don't track state changes like this anyway, so that shouldn't be an issue. The revision log would stay the same. We just need a way to change the publication status from 0 to 1 and update the base entity table.
Comment #3
fabianx commented
#2: Yes, and indeed that is the big problem - "just update the base entity table" ...
Maybe it indeed works in D8 out of the box, but in D7 it was a huge hassle, which is why Drafty settled on always creating a new revision when publishing.
Comment #4
Jaesin commented
Hello @jstoller.
I have seen all of the work you have done over the years to help get a reasonable base in core for managing revisions. I appreciate that and the topic has attracted my curiosity.
As you know, historically, the convention has been that each update to the base entity table's vid is accompanied by a new revision (that is identical to the revision being promoted), a new revision id and a new revision log message. A lot of people like it this way because it preserves revision history.
It would be possible to create a new operation (Action Button) that follows the convention of preserving history and also sets the new revision as published. There is nothing in core to prohibit this from happening in contrib that I know of.
Comment #5
jstoller
Preserving revision history is exactly what I'm trying to protect here. But in my mind, it's the history of the content that is paramount here. Forgetting how the database tables are structured, for the moment, publication status is ultimately no more than a flag applied to the content indicating a state. It is not content in and of itself. Copying old content to a new revision just to change its publication state will break the revision history.
To illustrate this, let me explain a real use case I am presently trying to implement in D7. I have content moderation, using Workbench Moderation, so every time I save a node it creates a new forward revision and I have several states content can go through as part of an approval process. My plan is to implement publication scheduling, so editors can schedule the publication of content at a specific date and time. So imagine I draft some changes to a node that I need published on Saturday morning, for a special event. Let's say I'm on revision #15 when my changes are reviewed and marked "approved" by an authorized content manager. Now I start drafting the changes that need to be published the following week, after the event is over. Let's say when I leave work on Friday I'm up to revision #20. Saturday morning revision #15 is published, as scheduled. If the system acts as you propose, it would copy revision #15 and create a new, published, revision #21. But this completely destroys the revision history of my content development. Now when I come into work Monday morning and go to edit this node, I'll be presented with old content, losing the changes I made in my last five edits. I would be forced to manually revert to revision #20, making even more of a mess of my revision history.
What I propose should happen in this situation is that my revision #15 should be published, as is, without creating any new revisions, thus maintaining the revision history of my content changes. Drupal should note the state change in watchdog, but Drupal core doesn't really track state changes anyway, at this point. My expectation is that whatever contrib module I'm using to manage content moderation will have a mechanism in place to track state changes.
Comment #8
d34dman commented
@jstoller, I have faced the problem you mentioned in your comment #5.
We partially solved it by asking the content editors to clone the node if they want to work simultaneously on another version of the node once they decide to publish a version (at a later point...).
I.e., you would be working on a new node instead of revision 16.
Comment #9
timmillwood
This is even more of an issue now that the Content Moderation module is in core, as it can be done via the UI. The problem is more prevalent still when working with content translation.
Comment #10
haasontwerp commented
Struggling with this as well. I implemented Workbench Moderation, so new revisions have a status, which is very nice. The workflow described above is exactly how I expected this to work: a content editor makes a new, unpublished revision (either draft or review), and a content manager gets an overview of content whose latest revision is in the "Needs review" state. But there is currently no way to set one of the newer revisions as the current revision or published node.
Comment #11
catch
The solution in #5 causes a lot of problems in practice.
When you publish a forward revision, you're making a change from x default revision to y default revision. Modules implement hook_entity_presave() and hook_entity_update() to react to these changes. #2833084: $entity->original doesn't adequately address intentions when saving a revision explains the many ways this can go wrong currently. One concrete example would be the taxonomy_index table in core which only tracks default revisions - if you don't save the entity with the correct before/after state, then this table will get out of sync.
The problem with the 'forked/branch' entity in #5 is that it creates two histories - the history of forward edits vs. the history of default revision changes. We shouldn't sacrifice the history of default revisions to keep a clean history of forward revisions - the fact the default revision was changed from x to y on Saturday morning is itself useful metadata.
That's better solved by the concept of workspaces so you can track particular sets of changes to entities together, then it'll at least be clearer what the history is. Also storing which revisions are 'default' vs. 'draft' (i.e. we should stop talking about forward revisions at all).
Comment #12
jstoller
@catch I would argue that we need to better separate the concepts of saving content and changing the state of content. When I make a change from x default revision to y default revision, I'm not making any change to the content (I'm just elevating the status of revision y), so this shouldn't trigger any presave/save/update hooks. If the {taxonomy_index} table needs to track default revisions, then it should implement a hook to react to state changes, in addition to content saves.
As for tracking the history of default revisions, I agree that is important, but you shouldn't have to save extra revisions to do it. I don't want to end up with five different revisions, all holding identical content, just because that content went through a bunch of state changes. Aside from the fact that this would create massive amounts of unnecessary bloat in the database, it also makes tracking the life-cycle of an entity far more difficult to do.
On my D7 sites, all state transitions are tracked in the {workbench_moderation_node_history} table, cleanly separating state change events from content change events (well, D7 has other issues, but I digress). When I go to the content moderation tab for a node, I can clearly see all the content changes AND all the state changes of each revision of that content. Using that same data model, I could create a number of different views into the history of an entity, depending on what I wanted to show. So, if I wanted to see a history of default revision changes by date, that would be no problem.
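A minimal sketch of that data model, written in Python rather than as Drupal code: state transitions are kept in their own log, separate from content revisions, so a state change never adds a revision. The `Transition` record, its field names, the state names, and the dates are all assumptions invented for illustration; this is not the actual schema of {workbench_moderation_node_history}.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Transition:
    """One state-change event for a revision (hypothetical record layout)."""
    revision_id: int
    from_state: str
    to_state: str
    when: datetime


def default_revision_history(transitions, published_state="published"):
    """Return (when, revision_id) for each time a revision became the default.

    Assumes that entering `published_state` is what makes a revision default.
    """
    events = [(t.when, t.revision_id) for t in transitions
              if t.to_state == published_state]
    return sorted(events)


# Invented sample log: revision 10 is published first, revision 15 later.
log = [
    Transition(15, "draft", "needs_review", datetime(2016, 12, 1, 9, 0)),
    Transition(15, "needs_review", "approved", datetime(2016, 12, 2, 14, 0)),
    Transition(10, "approved", "published", datetime(2016, 11, 20, 8, 0)),
    Transition(15, "approved", "published", datetime(2016, 12, 3, 8, 0)),
]
history = default_revision_history(log)
```

Filtering the same log by `revision_id` instead would give the per-revision state history, which is the other view described above.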
Comment #13
catch
The content is (usually) displayed on the site. That content changes when the default revision changes. So changing the default revision is the *most* change possible to the content in terms of what's visible, much more than making a new draft revision is.
Comment #14
jstoller
@catch: I stand by my assertion that changing the default revision is not changing content. It is changing which version of existing content is visible to the public. However, that hardly seems relevant to this issue. The public isn't looking at the revision history of nodes and doesn't really care if/how revisions are created and stored. This is purely an issue for the back-end interface, affecting admins and editors. I still fail to see what is gained by saving new revisions on state changes.
Comment #15
d34dman commented
@catch yes, valid points indeed.
Drupal architecture assumes a linear revision history for nodes, and trying to force a forked or non-linear workflow on revisions can't be solved easily. That's why we sacrificed revision history, by cloning the node (and its content), when editors wanted to fork.
I feel that if forking is allowed, then arguments regarding state vs. content don't matter. So the question is, what could be done to have content revisions support branches?
Comment #16
catch
@D34dMan there's #2786133: WI: Phase B: Extend the revision API with support for parents which deals with that issue.
Comment #17
jstoller
@D34dMan: A non-linear workflow is exactly what I'm arguing against. Newer revisions should always contain newer content, so I can trace the evolution of an entity in a straight line. Which revision happens to be the one that the public sees is a completely different question. Tracking changing default revisions is also very important, but I've yet to see any convincing arguments as to why we must break the continuity of tracking content changes to do so. If anyone can present such an argument, I'd love to see it. What am I missing?
I'll add that this is an issue with or without support for forking revisions. I would expect the revisions along each branch to likewise trace the linear evolution of content in that branch.
Think of it kind of like a git branch, except instead of one pointer to HEAD, you have two pointers: DEFAULT and CURRENT. CURRENT always points to the most current commit (aka. revision), while DEFAULT can point to any commit along the branch. New content changes are always based on CURRENT and when they are committed that pointer advances. Moving DEFAULT doesn't make any changes to existing commits, or make a new commit. It just advances the pointer to a new spot on the branch.
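The two-pointer idea can be sketched in a few lines of Python. This is only an illustrative model, not Drupal code: the `RevisionBranch` class and its `commit` and `set_default` methods are hypothetical names invented here, and revisions are reduced to plain strings.

```python
class RevisionBranch:
    """Toy model of one linear revision branch with two pointers.

    Hypothetical illustration only; none of these names exist in Drupal.
    """

    def __init__(self):
        self.revisions = []   # index i holds the content of revision i + 1
        self.default = None   # revision id the public sees (DEFAULT)

    @property
    def current(self):
        """Id of the newest revision (CURRENT), or None if the branch is empty."""
        return len(self.revisions) or None

    def commit(self, content):
        """Save new content as a new revision; only CURRENT advances."""
        self.revisions.append(content)
        return self.current

    def set_default(self, revision_id):
        """Move DEFAULT to an existing revision without creating a new one."""
        if not 1 <= revision_id <= len(self.revisions):
            raise ValueError(f"unknown revision {revision_id}")
        self.default = revision_id


branch = RevisionBranch()
for n in range(1, 21):        # revisions 1..20, as in the scenario from #5
    branch.commit(f"content v{n}")
branch.set_default(15)        # scheduled publication of revision 15
```

In this model, publishing revision 15 leaves the branch at 20 revisions, and the next edit would still start from revision 20.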
Comment #18
johnpitcairn commented
FWIW I agree with @jstoller
I see no reason why the default/current/published (whatever you want to call it) revision needs to be anything other than a specified revision in the history. It should be just a pointer that can be changed without creating a new revision. If that's not the case, that seems to me to be an architectural flaw.
The "edit" local task should always default to editing the latest revision, and create a new non-default revision from that. If there is some confusion about what will be edited, then that is a naming issue (edit latest? edit draft?). It should not be possible to edit an old revision; there should be no branches. Linear workflow, let's not confuse users.
Reverting to an older revision would clone that revision and place it on the top of the revision stack, without changing the default/current/published revision pointer. Then "edit" will edit that reverted revision.
Reverting publicly would be a multi-step process, first revert, edit if necessary, then "publish" (change the default/current/published revision pointer to the latest reverted revision).
The revision history should be a list of state changes, not a list of revisions. I think perhaps that's where we are having conceptual trouble? Presenting it as a list of revisions forces us into cloning a revision just to "publish" it and capture that in the history, when that should be only a state change?
Comment #19
johnpitcairn commented
Sorry, excessively flippant.
Comment #20
d34dman commented
@jstoller, I think we both want a similar user experience.
Referring to the diagram in this comment, it seems like we want a user experience for editors as shown by the "teal" colored boxes. I can understand how discarding the "da" and "daa" revisions would make it simpler to achieve the required UX.
However, I still think we can achieve that user experience even if branching or forking were supported. An added advantage would be that the behaviour is essentially the same if the editor chooses to work with an older revision of the node.
---
We can also improve the revision log presented to the user for the time being, if we show the version from which each revision was derived.
Comment #21
fabianx commented
#13: Technical question:
If I load a revision and set newRevision => FALSE, defaultRevision => TRUE:
What will happen?
Would that not essentially publish the loaded revision without further changing the revisions table?
I think the main problem is that we would have no way to know which is the published revision, right, because right now it is the top-most one, which is 'published'.
I think the new states for workflow will indeed help with that.
Comment #22
charles belov
Actually, there are use cases where branching/editing from an older version would be desirable.
An example would be a page concerning an annual event whose content might change only slightly from one year's event to the next year's, but that we won't know for certain how much it is changing until it is time for the next event.
Once the event happens in year one, we want to replace last year's content with a notice that the event has already happened and that the information for the next year is not yet available.
When the information becomes available for the next year, we would want to build that new revision from the older revision which has last year's details, not from the revision that says the information is not available.
Another use case is where someone has spent some time working on a change to an already-published page, but the new draft hasn't been approved yet. A temporary change comes through that needs to be published right away, but it will be discarded once the new draft is approved.
Edge cases to be sure, but they have happened in real life.
Comment #23
d34dman commented
@Fabianx Did some manual testing.
Test:
Let's say a node with id '1' has revision ids '1' through '10' in chronological order, with only the 10th revision published.
1. User creates revisions 11, 12 and 13
2. User runs the following script
Observation:
1. node_load(1) loads revision 5.
2. Visiting node/1 shows revision 5.
3. Editing node/1 shows revision 5.
4. Visiting node/1/revisions shows that revision 5 has the canonical link.
5. Visiting node/1/revisions shows that revision 5 is set to "current revision".
6. Visiting node/1/revisions shows that all revisions greater than 5 have the "Set as current revision" action available.
7. Visiting node/1/revisions shows that all revisions less than 5 have the "revert" action available.
Also, setting the status of revision 5 to TRUE or FALSE has no consequence on the behaviour. (EDIT: the node would appear as published/unpublished accordingly, but it will always load revision 5... and not some other published revision)
Nor does the publish status of revisions with an id greater than 5 have any consequence on the behaviour. (I didn't expect this, to be honest.)
Needs testing:
A casual code reading doesn't reveal any effect on taxonomy reindexing and node grants rebuild since both of them wipe out old data related to node and recreate new entries (against revision 5). But needs testing nonetheless.
Search reindexing accepts Node id as parameter (instead of current node object), so this part of the code has to be checked if it works as expected.
Comment #24
timmillwood
We need non-linear revision history for content replication / deployment to work correctly.
Let's take the example we're trying to get into core, workspaces, where on one site you have three workspaces: dev, stage, and live. If the same entity is edited on dev and stage, we need to know which revision each change stemmed from so that when the replication takes place we can resolve any conflicts.
This results in a revision history like this:

This example is taken from http://rakeshverma.in/final-submission. Rakesh worked with us on a Google Summer of Code project to build a library for resolving merge conflicts.
Ideally it'd result in a graph, not a tree, where a merged revision is created stemming from two parent revisions.
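To make the graph idea concrete, here is a rough Python sketch, not taken from the library mentioned above: each revision records its parent revisions, and a merged revision simply has two parents. The function names and the revision ids are made up for illustration.

```python
def ancestors(parents, rev):
    """Return the set containing rev and every revision reachable via parents."""
    seen = set()
    stack = [rev]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, ()))
    return seen


def common_ancestors(parents, a, b):
    """Revisions that both a and b descend from (candidates for a merge base)."""
    return ancestors(parents, a) & ancestors(parents, b)


# Hypothetical history: revision ids map to the ids of their parents.
parents = {
    "1": [],
    "2-dev": ["1"],
    "2-stage": ["1"],
    "3-merged": ["2-dev", "2-stage"],   # two parents make this a graph, not a tree
}
merge_base = common_ancestors(parents, "2-dev", "2-stage")
```

Knowing that both edits stem from revision "1" is what lets a conflict resolver compare each side against the same starting point.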
Comment #25
timmillwood
Comment #26
miro_dietiker
@timmillwood Love that the idea of the tree is coming up again. I did an analysis of revisions and their limitations back in 2012, and we have caught up with some of the problems outlined in the post:
http://www.md-systems.ch/en/blog/techblog/2012/06/16/drupal-8-multilingu...
Back then I discussed the need for a branching revision model with people. It could start as simply as each revision knowing its predecessor, but people felt this was too crazy and that Drupal 8 Core could never cover this model. Thus I steered toward the minimum need for a revision / graph per language. (And the revision tab is still confusing in multilingual revision workflows.)
Comment #27
catch
@Fabianx #21 - if you published an older revision like that, by overwriting the previous revision default, you'd then make it impossible to revert back to what was the previous default.
Comment #28
jstoller
@timmillwood I think a non-linear revision history, with branching and merging, is great. BUT when tracing along any individual branch there should be a linear progression with newer revisions always containing newer content. Again, I think git is a reasonably good analogy for this. Also, state transitions need to be separated from content updates.
Comment #32
dpi
Comment #33
berdir
Based on recent work that we did with e.g. ContentEntityStorage::createRevision(), which enforces a new revision when "publishing" a pending revision, @plach and I think this should probably be closed as either outdated or won't fix as is. Anyone against doing so?
Comment #44
smustgrave commented
Since there was no follow-up to #33, opting to close as outdated.