If there are thousands of comments on a node, and a site editor disables commenting on the node, without deleting or disabling the comments themselves, a cron run will choke and run out of memory, as ALL the comments are loaded into memory at once.

Site editors who have run into spam problems, who find that there are thousands of spam comments on a node, will naturally edit the node and disable comments, rather than the more lengthy process of deleting all the problematic nodes. This is not the "right" way to do it, but this is what the end user will do, every time, because it's easy.

search_cron() will actually invoke comment_nodeapi($op = 'update index') when the comment module is enabled. If this happens, this case will run:

case 'update index':
  $text = '';
  $comments = db_query('SELECT subject, comment, format FROM {comments} WHERE nid = %d AND status = %d', $node->nid, COMMENT_PUBLISHED);
  while ($comment = db_fetch_object($comments)) {
    $text .= '<h2>' . check_plain($comment->subject) . '</h2>' . check_markup($comment->comment, $comment->format, FALSE);
  }
  return $text;

This selects all the comments for a given node. This is then passed to the search indexer, which runs a lot of preg_split() on them, quickly filling up the available memory.

I tried this with 12MB of comments, 1200 in total (extreme, but not unreasonable for spammers), and a memory limit of 128MB, and it bailed every time.

The end result of this is that cron fails consistently, preventing other cron tasks from running, and preventing the search index from being updated.

I'd suggest that the PHP code above should check whether the node has comments disabled, and refuse to index comments on that node unless its comments are at least 'read only'.

Patch attached. Please go easy, as I have not submitted a patch to core before. Any tips on improving patches would be appreciated.

CommentFileSizeAuthor
comment-cron-denial.patch1.34 KBchriscohen
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Devin Carlson’s picture

Title: Denial of cron possible by comment spam » Disabling comments on a node with a large number of comments causes cron to fail (memory exhausted)
Version: 6.22 » 8.x-dev

Thanks for the patch! It sounds like this could be an issue on a number of sites.

Could you make a patch against Drupal 8 head? All proposed changes to Drupal core should be posted against the current development version (currently Drupal 8) and then backported to other versions as necessary.

Please see the Backport policy for more information.

catch’s picture

Title: Disabling comments on a node with a large number of comments causes cron to fail (memory exhausted) » Search module loads every comment into memory

I think there's an issue for this somewhere else, but can't locate it for now.

Not even bothering to check comments if they're marked as hidden is a good idea, but doesn't fix the root cause - updating the node in any way at all would also cause this problem.

For that we'd need to batch the comments and concat the HTML together so that memory can be freed (comments aren't statically cached iirc so that ought to help). That could be a separate issue though, this seems reasonable as a quick fix.

chriscohen’s picture

I'm afraid I don't have enough D8 knowledge to be comfortable putting together a patch for this, but the patch for Drupal 6 was fairly quick to put in place, so I'm hoping the Drupal 8 one will be too, although as catch mentions, it's only a quick fix for a deeper issue.

dixon_’s picture

subscribe

Devin Carlson’s picture

andypost’s picture

Status: Active » Closed (works as designed)

There's entity cache for that, we need a whole comments because they can have fields.
PS: feel free to reopen.