The automated unpublishing is great for stopping spam from being displayed, but it doesn't stop spam from corrupting another feature: my "recent posts" (tracker.module). I'll regularly get bursts of 200 auto-unpublished spams which make the "recent posts" page useless - an unpublished spam still affects the modification date of a node, which makes "recent posts" display craploads of updates, even though there really isn't any. Thus, the feature request would be:

- treat an unpublished spam as a deleted spam
- when a spam is automatically unpublished, revert the node modification date to its previous value.

I'm not entirely sure how easily possible this is, as you'd have to make sure the date is properly set when someone says "nah, this unpublished spam was actual ham", and then republishes it.

CommentFileSizeAuthor
#18 tracker.diff2.67 KBZed Pobre
#14 tracker_0.patch2.59 KBMorbus Iff
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Jeremy’s picture

This is a known and annoying bug. As it affects me too, I assure you I'll get to fixing this eventually...

Any ideas on how to cleanly implement this are welcome.

Morbus Iff’s picture

Would it seem to you that this behavior should be part of core? Ignoring spam.module entirely, if a node is updated at 11:00, and then updated with a comment at 12:00, the 12:00 is saved in there as the node's modification time. But if a comment is unpublished, shouldn't the node also be reverted back to its previous state (being 11:00)? How is the deletion of data (permanently hiding it) different from merely temporarily hiding it (ie., unpublishing it)?

Perhaps I should move request to Drupal.

Jeremy’s picture

Feel free to move it. If I come up with a clean solution (at least as far as spam is concerned), I'll update the issue wherever it is.

Morbus Iff’s picture

Was talking about this with killes (who "frankly [doesn't] care too much about comments"), and I'm now wondering if I'm going nutty with all this talk about node and comment timestamps. We should investigate to see if tracker.module looks at /unpublished/ comment timestamps instead of just published ones. If it doesn't make a distinction, then it would theoretically be a quickie patch to a SQL statement to only check against published comment timestamps, not unpublished.

Morbus Iff’s picture

With a quick look, it looks like tracker_page() only checks node.status, and not comment.status. We should be able to put a patch in for comment.status = 1 (unpublished being 2), and that should solve our problem, right?

Morbus Iff’s picture

Apologies... "AND c.status = 0". Modifying the two queries in tracker_page() made things, seemingly, work as I desire. Can you confirm?

Jeremy’s picture

It didn't appear to help with regards to the spam module. I had a recent spam flood, and with or without the 'AND c.status = 0' I saw the same posts, including those with recently unpublished spam comments.

Morbus Iff’s picture

Odd - I had one unpublished piece of spam that I tested against (for 4.5.0, not CVS), and modifying tracker.module seemed to work as I intended. So, the following bit of SQL still shows you erroneous nodes with unpublished comments?

SELECT DISTINCT(n.nid), n.title, n.type, n.changed, n.uid, u.name, MAX(GREATEST(n.changed, c.timestamp)) AS last_post FROM node n LEFT JOIN comments c ON n.nid = c.nid INNER JOIN users u ON n.uid = u.uid WHERE n.status = 1 AND c.status = 0 AND '1' GROUP BY n.nid, n.title, n.type, n.changed, n.uid, u.name ORDER BY last_post DESC LIMIT 10;

The above is what is actually run against my database with the proposed tracker.module change. (This code is currently active on gamegrene.com, so we can test live if need be. Submit a comment as an anonymous user with the words "poker" or "texas hold'em", and it should be a) unpublished automatically via your module, and b) not shown in the tracker.

Morbus Iff’s picture

For what it's worth, I just woke up to another flood of 150 spams, and no corruption of my recent posts.

Jeremy’s picture

Then it sounds like a patch is in order... For the tracker module.

Morbus Iff’s picture

Have you been able to confirm this though?

Morbus Iff’s picture

Project: Spam » Drupal core
Component: Code » tracker.module
Assigned: Unassigned » Morbus Iff
Category: feature » bug
Morbus Iff’s picture

Title: Revert modification time after unpublishing » Ignore unpublished comments when determining last_post
Morbus Iff’s picture

FileSize
2.59 KB

Patch attached for 4.5.0. Haven't tried applying it to CVS, but it needs to go there too.

killes@www.drop.org’s picture

I think it makes sense to only select published comments. I think this bugfix should go in both 4.5 and cvs. +1

Dries’s picture

Committed to DRUPAL-4-5.

grohk’s picture

Reopening this issue. I just upgraded to 4.5.2 and noticed that this patch did indeed filter out the nodes with unpublished comments, but it also seems to be causing a new problem in the tracker. I have observed that new posts do not show up in the tracker until they have been commented on. I rolled back the tracker.module installed on my site to revision 1.100 and the problem was resolved, but the phantom comments (unpublished spam) were back. I am going to do some more investigations, but I am rather inexperienced in this realm. Anyone have an ideas?

Zed Pobre’s picture

FileSize
2.67 KB

To fix this, instead of just checking c.status = 0, check (c.status = 0 OR c.status IS NULL). I've attached a patch to do this.

grohk’s picture

So far so good. This patch seems to have fixed my tracker. Thanks Zed, I really appreciate it.

Jeremy’s picture

Agreed, the updated patch seems to be working perfectly for me. Now these constant floods of spam don't render my tracker useless. Thanks! :)

Steven’s picture

Applied to 4.5 and HEAD.

pamphile’s picture

Assigned: Morbus Iff » pamphile

The patch worked for me too...

Anonymous’s picture

paul-at-murphymaphia.com’s picture

Version: » 4.5.6

It seems that some versions of MySQL return a NULL value as a result from MAX(GREATEST(n.created, c.timestamp)) if there are no comments (c.timestamp = NULL)

I propose modifying the query to change:

MAX(GREATEST(n.created, c.timestamp)) AS last_post

to

IF(MAX(GREATEST(n.created, c.timestamp)) IS NULL, n.created, MAX(GREATEST(n.created, c.timestamp))) AS last_post