Problem/Motivation

Please check the attached screen recording which demonstrates the issue.

On the recording you can see that initially, I had the following tree structure represented by documentation nodes, linked together with a single value ERH field:

  • Parent
    • 1. page
      • 1.1 page
    • 2. page

After that, I created a new revision from 2. page and changed its parent from Parent to 1.1 page and renamed it to 2. page (Parent to 1.1 page):

  • Parent
    • 1. page
      • 1.1 page
        • 2. page (Parent to 1.1 page)

What I was expected to see after that on the UI is the same structure as the tree describes above, rather than what I saw is:

1. When I checked Parent page's children I still could see 2. page there but with its new title. (Always the latest entity (content) revision is loaded on the current UI.)
2. Moreover, I could also see 2. page with its new title among 1.1 page children.

This issue is also caused by how this module tries to support entity revisions in nested sets table. Currently, a nested set table's revision_id field stores the entity revision which created that partial tree. Partial tree, because when an item's parent changes only the affected subtree is rebuilt rather than the whole tree, that is why with the current architectural it would be almost impossible to load the correct child tree after 2. page's parent changed to 1. 1 page.

Proposed resolution

I think nested set tables should have an additional tree_id field as well. So a nested set table would have the following fields: id, revision_id, tree_id, left_pos, right_pos, depth. Tree_id would be a foreign key of a new table, called nested_set_{FIELD_NAME_}tree, which would store unique (auto incremental) tree ids. tree_id should be stored in entity hierarchy field tables as well, so an ERH field's table should have target_id, weight and tree_id fields beside the usual ones.

Let's got back to the initial example and check how proposed solution would solve the problem:

  • Parent
    • 1. page
      • 1.1. page
    • 2. page

Mark this structure with TREE-0 tree id. When the first Parent node is created without any parent reference selected it creates a new tree, called TREE-0. The tree id is generated in nested_set_{FIELD_NAME_}tree table and stored in Parent node's ERH field's tree_id value. When 1. page is added to Parent as a child the system can detect that this node is a child and therefore 1. page inherits its parent's (or we can see root node's) tree_id, so its ERH field's tree_id is also TREE-0 according to this example.

  • Parent
    • 1. page
      • 1.1. page
        • 2. page (Parent to 1.1 page)

This change basically creates a completely new tree, called TREE-1. Which means that all entities in the tree should have a new entity revision created and the ERH field's value on new revisions should be updated to the proper tree_id (TREE-1 in this case). This is important because by storing tree_ids in entities' ERH field revisions proper entity revisions can be identified and loaded when an entity's children are listed on the UI (compared with the current implementation, where revision_id is not able to provide the correct information). This is just a minor modification, for example NestedSetInterface::findDescendants() function also needs to add a new tree_id condition to load all entries from nested set table which belongs to the same tree (version).

Entity hierarchy also makes it possible to achieve this structure by moving Parent under 2. page

  • 1. page
    • 1.1. page
  • 2. page
    • Parent

What do we get? Basically two independent trees from TREE-0. TREE-1 = {1. page, 1.1 page} and TREE-2 = {2. page, Parent}. This is much complex situation than the first one.

Remaining tasks

  • Rebuilding a complete tree when a tree item gets a new revision or reverted to a previous revision (so anything which changes the tree structure) could be really time-consuming, because not just nested tree values need to be recalculated, but also new entity revisions need to be created. Should it happen in batch?

User interface changes

Probably.

API changes

Described in the Proposed resolution section.

Data model changes

Described in the Proposed resolution section.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

mxr576 created an issue. See original summary.

mxr576’s picture

Title: Entity hiearchy does not respect entity revisions » Entity hierarchy does not respect entity revisions
larowlan’s picture

I think the answer here is, yes - this is by design.

The nested set representation is primarily concerned with the default revisions.

Whilst this means you cannot view forward revision trees or backward revision trees, I'm not too concerned about that.

But I need to fully digest what you're proposing and the video before I comment fully.

There is logic to move sub-trees when an entity revision becomes the default, but it may not be working as expected.

I will try and write a kernel test to describe your scenario, as this will help my understanding (feel free to beat me to it).

larowlan’s picture

another factor here, we use ER and not ERR field, so the parent can only ever refer to one revision, not a specific revision, which I think supports my case that this can only work for the default revision and not for any revision at given point in time.

larowlan’s picture

I should point out, I'm not averse to this change, just that I doubt I'll have capacity to work on it.

It would require changes at three levels as follows:

  • We'd need to subclass entity_reference_revisions field instead
  • We'd need to update the nested set library to include the tree ID (note the tree ID could be the revision ID of the root drupal entity)
  • We'd need to update the integration code and tests
larowlan’s picture

I should note that whilst the linked detail includes a tree ID, we use one table per tree here, so perhaps we might not need changes to the library, although it would make views integration tricky

acbramley’s picture

Component: Bug report » Code (module)
Status: Active » Needs review
FileSize
1.07 KB

I think this failing test demonstrates what's going wrong here. The original parent still has reference to the old revision of the child, so on the Children tab of the original parent you still see the child.

Status: Needs review » Needs work

The last submitted patch, 7: 2863284-failing-test-7.patch, failed testing. View results

larowlan’s picture

That is the intended behaviour of the tree, but if it is showing on the children tab, then yeah - we need to filter out non default revisions from that page.

The idea being that if you have a forward revision (draft) that has a new parent, you should still find the default revision (published) against the old parent.

RobbM’s picture

The idea being that if you have a forward revision (draft) that has a new parent, you should still find the default revision (published) against the old parent.

That sounds sensible, and a scenario whereby users would expect to see a node appear twice in the tree. (Hopefully, with the titles of the corresponding revisions, rather than necessarily the latest revision).

However, if I understand correctly, the video shows an example where the forward revision has been published, so we'd expect the previous revision to be unpublished and removed from the tree. (Presumably that's why you've suggested a filter for non-default revisions.)

Would I be right in thinking that a workaround would be to avoid creating a new revision at the same time as changing a parent? (I'm inferring that that's what's caused the "logic to move sub-trees when an entity revision becomes the default" to produce this unexpected result.)

mxr576’s picture

@larowlan Have you made any progress regarding this topic?

larowlan’s picture

No haven't sorry

mxr576’s picture

Wow, this is a 6 years old issue... Reminder for myself to check how things are working today.

larowlan’s picture

Title: Entity hierarchy does not respect entity revisions » Explore storing distinct trees for each revision via a TREE ID column
Category: Bug report » Feature request
Priority: Major » Normal

\Drupal\entity_hierarchy\Form\HierarchyChildrenForm::form calls \Drupal\entity_hierarchy\Storage\EntityTreeNodeMapper::loadAndAccessCheckEntitysForTreeNodes which calls \Drupal\entity_hierarchy\Storage\EntityTreeNodeMapper::loadEntitiesForTreeNodesWithoutAccessChecks which has this code

$entities = $this->entityTypeManager->getStorage($entity_type_id)->loadMultiple(array_map(function (Node $node) {
      return $node->getId();
    }, $nodes));
    $loadedEntities = new \SplObjectStorage();
foreach ($nodes as $node) {
      $nodeId = $node->getId();
      $entity = isset($entities[$nodeId]) ? $entities[$nodeId] : FALSE;
      if (!$entity || ($entity->getEntityType()->hasKey('revision') && $node->getRevisionId() != $entity->getRevisionId())) {
        // Bypass non default revisions and deleted items.
        continue;
      }
      $loadedEntities[$node] = $entity;
      if ($cache) {
        $cache->addCacheableDependency($entity);
      }
    }

In this scenario $node is a nested set node from the tree and $entity is the Drupal entity.
We're calling loadMultiple which only returns default revisions and we're then dropping tree nodes if the revision ID doesn't match the default one.

This was changed in #2904307: Reorder children saves every revision needlessly in August 2017 which was after this was created

I suspect the issue regarding non default revisions being shown in the children tab no longer exists.

However, I'm still interested in the tree ID idea but on large sites the size of the tree and the number of rows to update can be problematic, so that would likely exacerbate the issue adding more records to the table.

I'm going to move this to a feature request to explore that.

larowlan’s picture

I wonder if #3411981: CTE Rewrite makes any of this easier?