CommentBlockFunctionalTest fails randomly, something like 1 in 10 test runs, it should not do that.

Marking critical because apparently randomly failing tests stop critical bugs from getting committed.

If there is not an obvious fix, we should comment out the bad assertions until it can be made to work.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

catch’s picture

mr.baileys’s picture

Status: Active » Needs review
FileSize
690 bytes

Haven't yet been able to reproduce this locally, but here's a guess of what could be causing the issue:

The test adds three comments, and then verifies that only the 2 most recent comments are shown in the Recent Comments block. The recent comments block uses comments_get_recent, which in turn uses the following query:

  $query = db_select('comment', 'c');
  $query->innerJoin('node', 'n', 'n.nid = c.nid');
  $query->addTag('node_access');
  $comments = $query
    ->fields('c')
    ->condition('c.status', COMMENT_PUBLISHED)
    ->condition('n.status', NODE_PUBLISHED)
    ->orderBy('c.created', 'DESC')
    ->range(0, $number)
    ->execute()
    ->fetchAll();

When inserting 3 comments programmatically, they may share the same timestamp, and the above query no longer guarantees that they are ordered from most recent to older.

Suggested fix attached.

tstoeckler’s picture

FileSize
683 bytes

Here's a patch which adds a usleep(1) in between the comment Creating to ensure they have a different timestamp.

mr.baileys’s picture

Well, if our guess is correct, and the issue is caused by multiple comments having the same timestamp, then this is a bug in comment_get_recent() (not correctly ordering results) and I'd rather fix it there than work around the issue by altering the test.

tstoeckler’s picture

Well at first ordering by cid seemed somewhat random to me, but I guess it is reasonable to assume that the cid of a later comment is always higher than that of a previous one, so I guess #2 looks good.

catch’s picture

We were ordering by cid at one point, but that was reverted, I can't find the issue though but didn't look through git history yet.

It'd be good to run an explain on the query, it may be indexed enough as it is, but would be good to confirm.

Two comments posted in the same second can't be replies to each other or otherwise meaningfully ordered IMO, so it feels like a testing-only bug to me, even if the semantic ordering is wrong, but either way we fix this isn't going to be perfect.

One thing we could maybe do in the test other than the sleep() is set the timestamp when posting the comments maybe?

mr.baileys’s picture

The issue you are referring to (that removed cid from ordering) is #1149912: Recent comments block displays old, out-of-order comments.

catch’s picture

hmm, I wonder if that issue introduced the random test failure then - 4-5 weeks ago sounds about right.

If we were ordering by cid that recently, let's just add it back again.

Side note, doesn't look like the index added in that patch would do anything, the query is something like:

SELECT * FROM comment c INNER JOIN node n on c.nid = n.nid WHERE c.status = 1 and n.status = 1 ORDER BY created DESC.

And there is already an index on array('nid', 'status', 'created', 'cid', 'thread').

So it'd be good to get an EXPLAIN on the new query, if it's using that index, then we should drop the {comment}.created index. We could do that here, or re-open the other issue.

aspilicious’s picture

Yeah the tests started failing a few weeks ago.

mr.baileys’s picture

Explain for the query (with the patch from #2 applied) and 250 nodes.

SELECT c.* FROM comment c INNER JOIN node n ON n.nid = c.nid WHERE (c.status = '1') AND (n.status = '1') ORDER BY c.created DESC, c.cid DESC LIMIT 10 OFFSET 0
select_type table type possible_keys key key_len ref rows extra
SIMPLE n ref PRIMARY, node_status_type node_status_type 4 const 18 Using index; Using temporary; Using filesort
SIMPLE c ref comment_num_new, comment_nid_language comment_num_new 5 d8.n.nid,const 1 Using where
catch’s picture

OK so the index added in the other issue is not doing anything at all, that's not critical but we should either remove it again here or add a task somewhere to clean it up.

The node section of that explain doesn't look very nice, but that is not really new - is going to be the same in the old versions of the query. Would either need to denormalize node.status to comment, or consider doing some filtering in PHP (how many new comments are really likely to be on an unpublished node?), but that's also another issue.

sun’s picture

So #2 + an update to remove the index for this issue?

sun’s picture

Though, odd. The EXPLAIN posted in #1149912-4: Recent comments block displays old, out-of-order comments looked entirely different... sorry, relied on that.

mr.baileys’s picture

Just verified that the EXPLAIN for current HEAD (so without the patch from #2) is the same as the EXPLAIN posted in #11 (so with the patch from #2 applied)

I'd prefer have #2 committed to fix the random test failures (and is considered critical), and then re-open #1149912: Recent comments block displays old, out-of-order comments to hash out the index stuff (An update function was added to add the index. Now the next update function would remove it again. Let's take some time to make sure we get it right to prevent 3 index-altering update functions in a row :) )

catch’s picture

Issue tags: +Needs backport to D7

Re-opened the other issue to discuss the index.

Since this restores ordering back to where it was three weeks ago (minus the bug the other issue fixed), the EXPLAIN here is as good as it's going to get with the current query, and the other issue is going to need the indexes properly examining anyway, this is RTBC.

Does not need tests because it already has them and they're failing due to the code, just not every single test run. If by some bizarre chance there is another reason that test fails, we'll soon find out.

However:

+    ->orderBy('c.cid', 'DESC')

Let's add a comment just above here so someone doesn't come along and remove it again. Something like

"Additionally order by cid so that comments posted with the same timestamp are shown in the exact order posted"?

mr.baileys’s picture

Makes sense, patch attached with the additional comment on why we are introducing c.cid ordering.

catch’s picture

Status: Needs review » Reviewed & tested by the community

Thanks!

grendzy’s picture

I think the code comment should explain the cid order is needed for SimpleTest. Otherwise, the reader is left wondering why exact ordering matters.

sun’s picture

The existing comment sufficiently explains the additional orderBy for me.

rfay’s picture

This breaks the testbot fairly regularly. Let's get it into both 8.x and 7.x

Dries’s picture

Status: Reviewed & tested by the community » Fixed

Committed to 7.x and 8.x. Thanks!

Automatically closed -- issue fixed for 2 weeks with no activity.