Problem/Motivation

Issue found in 31d1114a36321403610c8167c6227f91ea9bc637.

When leaving a reply to a comment, any comments that follow the reply in the subsequent page reload will be indented incorrectly. For example, let's say we have three comments on a node.

|- A
|- B
|- C

A user responds to comment A with comment a1. We expect the indenting to look like this:

|- A
|--a1
|- B
|- C

Instead, all comments (B and C in this example) that follow a1 will be indented to the level of a1.

|- A
|-- a1
|-- B
|-- C

Clearing the site cache will bring the indentation back to the expected pattern:

|- A
|--a1
|- B
|- C

Investigation so far

Wim and I looked into this issue and found the following clues:

  1. We are missing a closing div on the indented comment, which causes all the following comments to indent as the browser "corrects" the malformed HTML and wraps the following siblings in the unclosed div:

    Screenshot of and IDE with the HTML from a comment thread. There is a point indicated where the closing div tag should be but is missing.

  2. If you delete a reply comment (a1) and then reply to the same comment (essentially recreate a1), the indentation is correct on the subsequent page load.
  3. We think the problem is somewhere in comment.module:comment_get_thread and comment.module:comment_prepare_thread. That's where the inclusion of closing divs is calculated.

Proposed resolution

Investigate further.

User interface changes

None.

API changes

None expected.

Original report by @jessebeach

Comments

wim leers’s picture

Priority: Normal » Major

I think this might be major, because it leaves the (potentially anonymous) commenter with a very jarring experience.

andypost’s picture

Actually 'divs' are added in \Drupal\comment\CommentViewBuilder::alterBuild

Interesting bug anyway

wim leers’s picture

#2: sure, but the calculations to determine if and how many closing </div>s happens in comment_prepare_thread().

Anonymous’s picture

I'm not able to reproduce this in a fresh 8.x-dev pull.

blueminds’s picture

Still an issue. One hint though - if you clear the cache the comments tree is rendered correctly.

blueminds’s picture

Oh, missed the part in the issue summary that the clear cache solves the problem...

I compared the values of $comment->divs and $comment->depth before and after the clear cache and they are identical.

before clear cache:

cid: depth divs
3: 0 0
  9: 1 1
  4: 0 -1
    5: 1 1
  2: 0 -1
    7: 1 1
  1: 0 -1
    8: 1 1
    6: 0 -1

after clear cache:

cid: depth divs
3: 0 0
  9: 1 1
4: 0 -1
  5: 1 1
2: 0 -1
  7: 1 1
1: 0 -1
  8: 1 1
  6: 0 -1

Also $comment->div_final before and after clear cache has the same value. So it looks like the problem is not in the comment_prepare_thread().

blueminds’s picture

I maybe located the problem.

Debugged CommentViewBuilder::alterBuild() and all seems to be built correctly including the closing divs. A closing div is added by the following comment that is supposed to jump out of the indentation. That leads me to an idea that as the following comment has not been updated the cache for it is still valid and therefore the old cached output is used that does not contain the "trailing closing" div. But what confuses me, if it is cached, why afterBuild() then triggers.

andypost’s picture

@blueminds awesome reseach!
Probably the problem in render cache, so some comments are rendered from cache and some (all parents of the new posted comment).

Maybe comment should clear render cache for all its parents?

blueminds’s picture

@andypost Actually not the parent is the problem, but the following comment in the tree as that is the one that adds the closing div

wim leers’s picture

Status: Active » Needs review
Issue tags: +D8 cacheability, +Quick fix
StatusFileSize
new896 bytes

It cannot be caused by the render cache; it *can* be caused by incorrect invalidation, i.e. missing cache tags.

This got me thinking, and sure enough… that's all it was. A super simple patch fixes the problem here.

A comment must not only return its "own" cache tag, but also the commented entity's cache tag, because it depends on that. In fact, it even refers to the commented entity:

<a href="/node/1#comment-5">Permalink</a>
wim leers’s picture

Thanks a *lot* to blueminds for debugging this — without his analysis, I wouldn't have thought of this :)

Status: Needs review » Needs work

The last submitted patch, 10: comment_indentation_bug-2254181-10.patch, failed testing.

wim leers’s picture

Status: Needs work » Needs review
Issue tags: +sprint
StatusFileSize
new2.85 KB
new2.92 KB

#10 was a bit too coarse. A comment itself doesn't

berdir’s picture

+++ b/core/modules/comment/lib/Drupal/comment/CommentViewBuilder.php
@@ -68,6 +69,20 @@ public function __construct(EntityTypeInterface $entity_type, EntityManagerInter
+
+    // Comments are rendered as part of a node, so whenever the node is modified
+    // (whether that be a change in the path alias or even a new comment being
+    // posted), ensure that this rendered comment is invalidated also.
+    $build['#cache']['tags'] = NestedArray::mergeDeep($build['#cache']['tags'], $entity->getCommentedEntity()->getCacheTag());

a node => the commented entity?

jessebeach’s picture

Nice work blueminds!

wim leers’s picture

StatusFileSize
new2.88 KB
new1.32 KB

#14: ugh, yes, of course. (I was doing this in between other things…)

This comment should be better.

berdir’s picture

Issue tags: +Needs tests

Do we need a test that the indentation is now correct? There might be some existing tests that we can extend?

I was confused by the first patch which added the commented entity manually to getCacheTags(), this should already happen through the referenced entities tag collection, right? That's why the new patch just makes sure that this cache tags also exists on the comment, so that we rebuild all comments if any comment or the commented entity changes? That's kind of sad because it means a lot of re-generating on every point (which means that the by-comment cache is hardly useful at all?)

wim leers’s picture

StatusFileSize
new2.91 KB
new525 bytes

Do we need a test that the indentation is now correct? There might be some existing tests that we can extend?

Indeed. We already have CommentThreadingTest which tests threading already, but doesn't test the presence of the correct <div class="indented"> tags.
@blueminds, would you like to work on that? :)

I was confused by the first patch which added the commented entity manually to getCacheTags(), this should already happen through the referenced entities tag collection, right?

Yes, the first patch was wrong.

That's why the new patch just makes sure that this cache tags also exists on the comment, so that we rebuild all comments if any comment or the commented entity changes?

Correct.
This is why I also considered just adding a comment_on_<entity type>:<entity ID> cache tag, which would have the same effect, but would be less confusing. That being said, every rendered comment must also be tagged with the commented entity's cache tag, because if the commented entity changes its URL, then each comment permalink will also need to be changed. Hence the need for what the current patch is doing.

That's kind of sad because it means a lot of re-generating on every point (which means that the by-comment cache is hardly useful at all?)

Correct & correct.
I think it'd make sense to disable render caching on comments for this reason. Done in this reroll.

blueminds’s picture

yup, will take a look at tests tonight or tomorrow morning.

catch’s picture

I think we should only add the cache tag for the commented entity when threading is enabled. Can't see a reason to invalidate the cache when a comment isn't in the thread.

wim leers’s picture

#20: Indeed.

blueminds’s picture

Here comes the test.

Status: Needs review » Needs work

The last submitted patch, 22: comment_indentation_bug-test_only-2254181-22.patch, failed testing.

blueminds’s picture

Status: Needs work » Needs review
wim leers’s picture

Status: Needs review » Needs work
StatusFileSize
new11.32 KB

@blueminds: Thanks! :) And sorry for having taken so long to provide a review — I lost track of this issue :( Next time, feel free to ping me on Twitter or IRC!


  1. +++ b/core/modules/comment/src/Tests/CommentTestBase.php
    @@ -179,11 +179,12 @@ public function postComment($entity, $comment, $subject = '', $contact = NULL, $
       function commentExists(CommentInterface $comment = NULL, $reply = FALSE) {
         if ($comment) {
    -      $regex = '/' . ($reply ? '<div class="indented">(.*?)' : '');
    +      $regex = '!' . ($reply ? '<div class="indented">(.*?)' : '');
           $regex .= '<a id="comment-' . $comment->id() . '"(.*?)';
           $regex .= $comment->getSubject() . '(.*?)';
           $regex .= $comment->comment_body->value . '(.*?)';
    -      $regex .= '/s';
    +      $regex .= ($reply ? '</article>\s</div>(.*?)' : '');
    +      $regex .= '!s';
    

    How does this help with but doesn't test the presence of the correct <div class="indented"> tags.?

  2. +++ b/core/modules/comment/src/Tests/CommentThreadingTest.php
    @@ -48,74 +47,75 @@ function testCommentThreading() {
    -    // Post comment #2 overall comment #5.
    -    $this->drupalLogin($this->web_user);
    +    // Post comment #3 overall comment #5.
    

    Above this point, you correctly incremented everything, to compensate for the new "comment #2" that you added.
    But starting here, you forgot to update the comment number in code comments and variable names. This is very confusing.

    Also: let's prefix each comment subject with the "#ID" we use in the code comments, that makes it much easier to understand.


Attached is a reroll to make the patch apply to HEAD again. In Comment,

  *     "bundle" = "field_id",

changed to

  *     "bundle" = "comment_type",

That's it :)

andypost’s picture

Status: Needs work » Needs review
andypost’s picture

Suppose the patch is good to go, except the reason to disable render cache. Suppose the problem here that comment renders indent but should not.

@Wim seem there should be a follow-up "enable render cache for comments and remove indent from single comment template"....

OTOH there's #1962846: Use field instance name for header of comment list, drop comment-wrapper template so new field template (comment wrapper) would care about comment indent.

+++ b/core/modules/comment/src/CommentViewBuilder.php
@@ -68,6 +69,21 @@ public function __construct(EntityTypeInterface $entity_type, EntityManagerInter
+  protected function getBuildDefaults(EntityInterface $entity, $view_mode, $langcode) {
...
+    $build['#cache']['tags'] = NestedArray::mergeDeep($build['#cache']['tags'], $entity->getCommentedEntity()->getCacheTag());

+++ b/core/modules/comment/src/Entity/Comment.php
@@ -38,6 +39,7 @@
+ *   render_cache = FALSE,

It's not clear do we need both changes?

Status: Needs review » Needs work

The last submitted patch, 25: comment_indentation_bug-2254181-25.patch, failed testing.

cbr’s picture

Assigned: Unassigned » cbr
Status: Needs work » Needs review
StatusFileSize
new11.51 KB

Rerolled

wim leers’s picture

Suppose the problem here that comment renders indent but should not.

You're right theoretically. Strange that none of us realized that before. Therefore, I agree with a enable render cache for comments and remove indent from single comment template follow-up in theory. Though having tried to do this just now, I think I know why: it's very tricky. We essentially need to render the comment, render cache it, and THEN add the prefix and suffix. That's not really supported.
Combined with the fact that we almost never render individual comments (they're rendered as part of the commented entity 99% of the time), I don't think this is worthwhile to pursue.

It's not clear do we need both changes?

The first is needed for correct cache invalidation (see #10).
The reasons for the second: please see berdir's remarks in #17 and my response in #18.

I think this is ready.

andypost’s picture

Assigned: cbr » catch
Status: Needs review » Reviewed & tested by the community

The issue fixes the bug, RTBC!
I agree that populating the render cache for comments is useless and could be only needed for forum (nonthreaded now)... probably

OTOH there's a log of issues to clean-up render for comments:
#2031883: Markup for: comment.html.twig
#2002094: Improve performance of comment.html.twig
#1920044: Move comment field settings that relate to rendering to formatter options
#1857946: Comment parent template variables are built twice

So let's get this in!

andypost’s picture

The issue #2318579: Remove comment_prepare_thread() moves the code to view builder

catch’s picture

Status: Reviewed & tested by the community » Needs work
+++ b/core/modules/comment/src/CommentViewBuilder.php
@@ -67,6 +68,21 @@ public function __construct(EntityTypeInterface $entity_type, EntityManagerInter
+    $build['#cache']['tags'] = NestedArray::mergeDeep($build['#cache']['tags'], $entity->getCommentedEntity()->getCacheTag());

Don't we only need to do this when threading is enabled for the field?

arla’s picture

Status: Needs work » Needs review
StatusFileSize
new11.48 KB
new1.81 KB

Rerolled and fixed #34. Discussed with @Berdir whether it is really only with threading enabled that we want to add the cache tags, and found no other case.

We also found out that in order to test the behaviour, you have to use a different user to trigger the cache clear, because the caches for rendered comments are tagged by user. Plus, to disable threading, you need to drush cedit or equivalent because of a bug in CommentItem.

Status: Needs review » Needs work

The last submitted patch, 35: comment_indentation_bug-2254181-35.patch, failed testing.

arla’s picture

Status: Needs work » Needs review
StatusFileSize
new12.32 KB
new857 bytes

Now that we merge in the tags of the entity, there is a fail because they are duplicated for the commented entity (since it in turn merges the tags of the comments). We could change drupal_merge_cache_tags() to eliminate duplicates, but I'm not sure that would be of advantage, and I'm guessing the simplicity of the function is justified. So this patch just makes the test expect the duplication.

wim leers’s picture

drupal_merge_cache_tags() already removes duplicates by design (Drupal core stores cache tags using their prefix as a key in an array), so I wonder how that's possible.

Could you debug this to find the root cause of this tag duplication problem?

arla’s picture

Did some debugging with @Berdir. Let me see if I get this straight.

Cache tags can be arrays of values, typically IDs of related entities. Such tags are sometimes expected to have their values as keys, e.g. array(1 => '1', 3 => '3'). In drupal_merge_cache_tags() they are, but in Entity::getCacheTag() they are not. This inconsistency is causing the duplication in this case, and might be problematic in other cases as well.

More specifically, if the $tags argument of drupal_merge_cache_tags() has something like array('entity_test' => array(0 => '1')) and the $other argument has array('entity_test' => array (1 => '1')) then the result will contain the value '1' twice.

The attached patch contains a fix for each of the two mentioned functions, as well as a passing version of CommentDefaultFormatterCacheTagsTest. The fixes work separately, i.e. you can apply any one of them to make the test pass. On the other hand, we expect fails from other tests, which expect cache tags to be set in some specific way or other.

To explain the drupal_merge_cache_tags() fix, this version does not expect arrays in $tags to already have their values as keys.

Let's have some feedback on this and then possibly create a new issue addressing the format of cache tag arrays.

Status: Needs review » Needs work

The last submitted patch, 39: comment_indentation_bug-2254181-39.patch, failed testing.

wim leers’s picture

Wow, excellent detective work!

This is why I've been wanting to open an issue to change all cache tags across core, to just be strings. i.e. node:5 instead of ['node' => [5 => 5]] (or ['node' => [5]], which is what HEAD does). Then we can just array_unique(array_merge($tags, $more_tags)) and not worry about the array aspects.

Because there's also the problem that if any code does ['foo' => TRUE'], then any other code no longer can do ['foo' => [35 => 35]]. We've had core bugs because of that. When using strings, we avoid all that.

Thoughts?

arla’s picture

I'm still quite new to the caching mechanism, but using strings only makes sense as far as I can see. I suppose that the following example (from Cache API doc):

$tags = array(
  'my_custom_tag' => TRUE,
  'node' => array(1, 3),
  'user' => array(7),
);

would then be:

$tags = array(
  'my_custom_tag',
  'node:1',
  'node:3',
  'user:7',
);
wim leers’s picture

Exactly!

wim leers’s picture

Opened an issue for this: #2340123: Setting cache tags can be tricky: use strings instead of nested arrays to improve DX. Discussed with catch, he's on board with this change.

berdir’s picture

Ok, how do we continue here then? I guess we can go with a minimal fix as we hope it will get simplifed in that issue?

Maybe we can get back to #37, with a @todo on the new issue?

andypost’s picture

Maybe it's time to fix a cause (wrong cache clean for indented comments) and left @todo

Indenting should be a field level post render callback somehow...

  1. +++ b/core/includes/common.inc
    @@ -1842,19 +1842,32 @@ function drupal_merge_attached(array $a, array $b) {
     function drupal_merge_cache_tags(array $tags, array $other) {
    +  $merged_tags = array();
    ...
    +  return $merged_tags;
    

    let's move that to other issue with @todo

  2. +++ b/core/lib/Drupal/Core/Entity/Entity.php
    @@ -404,7 +404,7 @@ public function referencedEntities() {
    -    return array($this->entityTypeId => array($this->id()));
    +    return array($this->entityTypeId => array($this->id() => $this->id()));
    

    is this related to bug?

  3. +++ b/core/modules/comment/src/CommentViewBuilder.php
    @@ -67,6 +68,22 @@ public function __construct(EntityTypeInterface $entity_type, EntityManagerInter
    +  protected function getBuildDefaults(EntityInterface $entity, $view_mode, $langcode) {
    ...
    +    if ($entity->getCommentedEntity()->getFieldDefinition($entity->getFieldName())->getSetting('default_mode')) {
    
    +++ b/core/modules/comment/src/Entity/Comment.php
    @@ -40,6 +40,7 @@
    + *   render_cache = FALSE,
    

    this setting lives on field level but prefer at formatter level.
    I see no reason to cache indented comments and remove render cache fr comments, this is a formatter level cache that needs proper cache clean

wim leers’s picture

#2340123: Setting cache tags can be tricky: use strings instead of nested arrays to improve DX now has a patch! Please help land it by reviewing!


#45: I'm fine with committing #37.


#46:
  1. Related to #2340123: Setting cache tags can be tricky: use strings instead of nested arrays to improve DX.
  2. Related to #2340123: Setting cache tags can be tricky: use strings instead of nested arrays to improve DX.
  3. I see no reason to cache indented comments and remove render cache fr comments, this is a formatter level cache that needs proper cache clean

    This is a bit difficult to parse. I don't understand everything, but I do understand you don't want render caching to be disabled on comments. But… comments are never rendered independently (except when replying to a specific comment, which is relatively rare). Rendered comments are already render cached as part of a render cached parent entity (usually a node). So what's the value in keeping render caching for comments?

arla’s picture

Status: Needs work » Needs review
StatusFileSize
new12.38 KB
new853 bytes

This patch continues from #37 and adds a @todo issue reference.

wim leers’s picture

Status: Needs review » Reviewed & tested by the community
Issue tags: -Needs tests

Thank you!

catch’s picture

Status: Reviewed & tested by the community » Needs review

But… comments are never rendered independently (except when replying to a specific comment, which is relatively rare). Rendered comments are already render cached as part of a render cached parent entity (usually a node). So what's the value in keeping render caching for comments?

298-comment long issue thread on Drupal.org, where threading is not enabled.

298 comments are on that thread, they're cached in both the node render cache as a single chunk of HTML, and individually.

Comment 299 is posted.

With render caching of comments, the 298 comments are loaded from render cache. Without, we have to separately render each one.

Same principle applies to a blog or other site that tends to have one or two highly commented articles at a time.

  1. +++ b/core/modules/comment/src/Entity/Comment.php
    @@ -40,6 +40,7 @@
    + *   render_cache = FALSE,
    

    If this is set.

  2. +++ b/core/modules/comment/src/CommentViewBuilder.php
    @@ -67,6 +68,22 @@ public function __construct(EntityTypeInterface $entity_type, EntityManagerInter
    +    // threading is enabled, make sure the rendered comments are invalidated as
    

    Then why do we also need this?

I'd prefer to keep the render caching and just set the cache tags. It's more cache sets but that has limited impact on the sites that don't need it and could keep the sites that do up.

berdir’s picture

Wondering if we should have an easier way to conditionally disable the render cache per-entity, right now, all we have is the global flag and by-view mode.

But yes, layered caching makes a lot of sense when you don't show threaded comments. Wondering if we can find a relatively easy way to block caching completely under certain conditions, then we could prevent it just for threaded comments?

wim leers’s picture

Title: Comment indentation is incorrect for comments following a replying-comment. » Comment indentation is incorrect for comments following a replying-comment: don't render cache comments for which threading is enabled
StatusFileSize
new12.07 KB
new1.68 KB

With render caching of comments, the 298 comments are loaded from render cache. Without, we have to separately render each one.

That is an excellent point.

BUT: the indentation wrapper for a comment is also render cached. So we can only do this when threading is disabled.

This is also what Berdir alludes to: But yes, layered caching makes a lot of sense when you don't show threaded comments. .

catch’s picture

@Wim yes I also said this in #20 ;)

+++ b/core/modules/comment/src/CommentViewBuilder.php
@@ -67,6 +68,20 @@ public function __construct(EntityTypeInterface $entity_type, EntityManagerInter
+      unset($build['#cache']);

Won't this unset the cache tags as well? Also this needs an explicit comment about the indentation markup being part of the rendered comment.

Status: Needs review » Needs work

The last submitted patch, 52: comment_indentation_bug-2254181-52.patch, failed testing.

wim leers’s picture

Status: Needs work » Needs review
StatusFileSize
new12.22 KB
new3.86 KB

#53:

  • Sorry :( This issue went silent for a long time…
  • You're right, and our test coverage caught that :) (The test entity has the expected cache tags when it has comments. is one of the failures.)

This should be green again.

berdir’s picture

Status: Needs review » Needs work

Did test this manually, turns out that the type safe check breaks things on a real installation because the schema of default_mode currently says boolean, while the constants are TRUE/FALSE.

So we either have to fix the schema, the constants or not do a type safe check. I'm OK with changing the schema, but I guess we could also change the constants? Type safe check is a good idea, although not sure if we should then do it everywhere consistently, which might expose new test fails, making the patch bigger again...

+++ b/core/modules/comment/src/CommentViewBuilder.php
@@ -67,6 +68,23 @@ public function __construct(EntityTypeInterface $entity_type, EntityManagerInter
   /**
...
+      $cache_tags = $build['#cache']['tags'];
+      $build['#cache'] = [];
+      $build['#cache']['tags'] = $cache_tags;

Also, as mentioned on IRC, should we open a follow-up issue to add a kill switch to #cache? It would be easy for someone implementing hook_entity_build_defaults_alter() to re-add a key and then it would be cached, just based on that key ;)

A kill switch is much harder to accidentally overwrite.

(I also think this is more complex than it needs to be, the system just cares about the existence of 'keys', but @WimLeers prefers it to be like this and I don't care that much)

larowlan’s picture

+1 for changing the schema, allows for other options in contrib that a boolean doesn't

catch’s picture

Follow-up to add a kill-switch sounds great.

wim leers’s picture

Status: Needs work » Needs review
StatusFileSize
new12.75 KB
new595 bytes
berdir’s picture

Status: Needs review » Reviewed & tested by the community

Thanks, I tested the same fix locally, worked for me then.

catch’s picture

Status: Reviewed & tested by the community » Fixed

Committed/pushed to 8.0.x, thanks!

catch’s picture

Issue tags: -Quick fix

  • catch committed 6398a19 on 8.0.x
    Issue #2254181 by Wim Leers, Arla, blueminds, cbr: Fixed Comment...
berdir’s picture

wim leers’s picture

Issue tags: -sprint

#62: yeah, this turned out to not be a "quick fix" after all…

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.