Following #2394453: Test Mollom on Drupal.org and #1694494: Install Mollom on Drupal.org, Mollom is now installed on Drupal.org. This issue is about trying out different configurations and selecting the one that best fits our needs.

Current configuration:

  • User roles which are being affected by Mollom: email unverified and authenticated. User roles 'confirmed' and up skip Mollom protection.
  • Forms which are being protected right now:
    • User registration form
    • Forum topic comment form
    • Forum topic node form
    • Issue edit/comment form
  • Protection settings: strict. When text analysis is unsure - accept the post. When post identified as spam - retain for manual moderation.
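
The configuration above can be read as a small decision procedure. The following is a minimal sketch in Python, for illustration only: the role names and the list of forms come from the summary above, while the `Verdict` enum, the `decide` function, and the form identifiers are hypothetical and are not the Mollom module's actual API.

```python
from enum import Enum


class Verdict(Enum):
    HAM = "ham"
    UNSURE = "unsure"
    SPAM = "spam"


# Roles that are checked, per the summary; 'confirmed' and up skip the check.
CHECKED_ROLES = {"email unverified", "authenticated"}

# Hypothetical identifiers for the protected forms listed in the summary.
PROTECTED_FORMS = {
    "user_registration",
    "forum_topic_comment",
    "forum_topic_node",
    "issue_edit_comment",
}


def decide(roles, form, verdict):
    """Return 'skip', 'publish', or 'hold' for a single submission."""
    # Any role outside the checked set (e.g. 'confirmed') bypasses the check,
    # as does any form that is not protected.
    if not set(roles) <= CHECKED_ROLES or form not in PROTECTED_FORMS:
        return "skip"
    # Identified as spam: retain for manual moderation.
    if verdict is Verdict.SPAM:
        return "hold"
    # Ham, or unsure: accept the post.
    return "publish"
```

For example, `decide(["authenticated"], "forum_topic_node", Verdict.UNSURE)` returns `"publish"`, matching the "when unsure, accept the post" setting.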

The goal is to support #2386793: Modify user role progression on Drupal.org. We will only be trying Mollom on content creation forms, which will be available to email unverified and authenticated user roles per the new role progression.

Attached files:

Comment  File       Size      Author
#34      spam1.png  42.58 KB  WorldFallz
#33      spam.png   26.04 KB  WorldFallz

Comments

tvn’s picture

Mollom protection on the Forum topic comment form has been enabled for the last 4 days. It seems to be doing OK. It caught a few spam comments. It also marked 2 or 3 valid comments as spam; I published those and gave their authors the 'trusted' role so they will never face this problem again. Based on the latest spam reports in the Webmasters queue, it seems that most of our spam currently is actually forum nodes. I enabled Mollom protection for the Forum topic node form just now and will watch it closely through the day.

WorldFallz’s picture

Where do these mollom moderated comments end up? The core comments page? Is checking this something webmasters should be doing as well?

tvn’s picture

They end up in the 'Unapproved comments' list at https://www.drupal.org/admin/content/comment/approval. Sure, webmasters are welcome to check it :)

That list is full of comments unpublished at different times. Cleaning those up would be a good thing to do in itself: there are 77 pages of unpublished comments going back to 2005, and those aren't helpful.

WorldFallz’s picture

Awesome... I'll take a look when I get a chance. And agreed-- cleaning up those old comments is definitely a good idea!

tvn’s picture

Yesterday we enabled permission for 'email unverified' users to create forum nodes. Let's see how it goes. So far all the spammers who tried to post something had their emails verified.

WorldFallz’s picture

Not sure where to post this, but I'm in the process of cleaning up a MAJOR spam incursion. Tons of clear and obvious spam (for an example see: http://drupal.org/user/316405/admin-nodes) spread across multiple users-- the likes of which we haven't seen since killes fixed the honeypot module (or was it the spam module) when we were being inundated with spam from Vietnam.

That mollom didn't catch this is quite concerning. Is mollom the only spam protection we have at this point?

dddave’s picture

https://www.drupal.org/u/nfklgdflkf is a good example why I don't care for Mollom.

tvn’s picture

Re: #9. Was that spam in forum posts or other content types? Honeypot is still here, no changes to it. Mollom was enabled in addition.

tvn’s picture

Looking at https://www.drupal.org/admin/reports/dblog?page=2 (filter by type=Mollom) it actually caught some of this spam, looks like at least half.

WorldFallz’s picture

thanks for the reply t! Good to know the other spam protection is still installed. The nodes are gone (even the ones I left unpublished), or I would create an issue to have killes add some markers to it. It was a couple of dozen forum threads spread across a handful or so of users -- usually less than a minute apart. I'll keep an eye for more.

geerlingguy’s picture

@WorldFallz: just FYI, I've seen a similar pattern on some other community sites, and it's tough, because neither Honeypot nor Mollom can effectively deal with a highly-targeted human-based spam attack like that (most of the time).

I've tried (and mostly succeeded, with the help of D.o maintainers) in making Honeypot's API flexible enough that you can add in some extra logic on top to try to stymie these kinds of attacks, but it's definitely a hard problem to solve for a site like Drupal.org, which is a huge target. Keep reporting the issues, and hopefully we'll find ways to make it better in the future :)

tvn’s picture

Mollom now seems to be doing OK at identifying posts from this recent non-English spam attack as spam. 'Reporting as spam' to Mollom definitely helped to teach it. Note that we run 'relaxed' settings, so Mollom will often mark a form submission as spam, accept it, but keep it unpublished. Only webmasters and admins can see those posts, so when you see a lot of unpublished spam posts, Mollom is actually doing its job.

It still regularly has a few false positives, when legitimate posts or comments are marked as spam and unpublished. So I would not recommend changing 'relaxed' settings to something more strict for now.

For the record, we haven't changed any permissions or settings, so the issue summary is still accurate.

tvn’s picture

Also, it is safe to delete spam forum posts right away. A copy of the post is saved with Mollom watchdog entry on post submission or edit, so we can always look at those if we want to do any sort of analysis.

WorldFallz’s picture

The dozens of posts I removed were actually published-- so they weren't flagged at all. That's what concerned me. I unpublished some to keep as samples, sorry if I was unclear in explaining that.

And yes, mollom seems to be doing much better.

But I just thought of something. When I moderate a single post I get the Mollom report page. When I do it from admin-comments or admin-nodes I don't-- does that mean that those are bypassing the Mollom report? Or is it done automatically?

tvn’s picture

Bulk deletions bypass the Mollom report. If you go to admin-nodes and click 'delete' next to an individual node in the list, you'll go through the Mollom report.

WorldFallz’s picture

Thanks for the quick reply-- good to know. For a smaller amount of posts I'll be sure to do the individual deletes. It's a bit tough to do that when it's more than a handful though, but luckily we haven't been getting many that large.

greggles’s picture

It should be possible to configure some VBO so that you can do bulk deletion/report as spam - https://www.drupal.org/node/655846

tvn’s picture

I had a chat with Nick_vh from Mollom team about the ways we can make it more efficient on D.o considering the latest spam attack.

It sounds like we should try slightly less relaxed settings so that less spam gets through, but to do that we'd first need fewer false positives. I opened a couple of issues:

Let's collect examples of false positives here so that Mollom team could take a look and analyze why they are marked as spam: #2529104: Collect Mollom false positives

Also, I'll try to find examples of another weird behavior: form submissions being identified as spam and rejected. Per our settings all submissions should now be accepted but left unpublished: #2529112: Collect Mollom false negatives.

Lastly, we'll look into ways to whitelist certain URLs for Mollom. Unfortunately that is not too easy; discussion is at #2529118: Whitelist certain urls for Mollom spam checks.

Another idea we could consider is to enable CAPTCHA on submissions where Mollom is not sure, and write some custom code to show it only when the submission is in a specific language (e.g. Chinese).
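
As a rough illustration of that last idea, here is a minimal Python sketch. Everything in it is an assumption: `looks_non_latin` is a crude stand-in for real language detection (it only measures the share of non-ASCII letters), the `"unsure"` verdict string is a placeholder for whatever the spam service reports, and none of this is the Mollom or CAPTCHA module API.

```python
def looks_non_latin(text, threshold=0.3):
    """Crude language heuristic: True if enough of the letters are non-ASCII."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return False
    non_ascii = sum(1 for ch in letters if ord(ch) > 127)
    return non_ascii / len(letters) >= threshold


def needs_captcha(verdict, text):
    """Show a CAPTCHA only when the analysis is unsure AND the text looks non-Latin."""
    return verdict == "unsure" and looks_non_latin(text)
```

With this sketch, an unsure submission written in Chinese would get a CAPTCHA, while an unsure English submission would be accepted as before. A real implementation would want proper language detection rather than a character-ratio guess.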

WorldFallz’s picture

Thanks for taking the lead on this, t! All your proposals sound awesome and will hopefully benefit Mollom as well.

With regard to the captcha, I would even say we could show a captcha for flagged posts in any language other than English. We only get maybe a handful of legit non-English posts in a 6 month period.

tvn’s picture

In the last few days Mollom was successfully blocking attempts to post massive non-English spam on forums, so today spammers switched to issue queues. Luckily they started with webmasters queue, so we were able to notice it quickly. They even managed to edit a few issues back and forth with me, hilarious. I enabled Mollom on issue comments for the time being.

tvn’s picture

Actually, we don't use the default comment form on issues; the issue comment form is a modified node edit form. So we should protect that one instead. Seems like spammers went quiet for now, so I'll wait a bit before turning Mollom on for the issue edit form.

tvn’s picture

Okay fine, enabling Mollom on issue edit form too.

webchick’s picture

I love when spammers report themselves. :P

tvn’s picture

I've said this before, we just need to teach them to block themselves, and we'll be done.

tvn’s picture

It appears that Mollom does not actually check issue form submissions, potentially because we customized the issue node form to also be the edit form. We'll have to look into this: #2535612: Check issue submissions / edits / comments for spam using Mollom.

dddave’s picture

For the next jibber-jabber you have with the Mollom folks, please ask why nodes like https://www.drupal.org/node/2536848 or https://www.drupal.org/node/2536850 don't get caught. I've reported dozens of similar nodes, so Mollom should have learnt that by now.

edit: Please https://www.drupal.org/node/2538948

dddave’s picture

Just for the record, I've recorded for the umpteenth time a spam node with an examguidez.com link in it...

edit: Reported multiple nodes before deleting but left this one for analysis: https://www.drupal.org/node/2540272

WorldFallz’s picture

Attachment: spam.png (26.04 KB)

And once again, we've been getting tons of these passing through the last couple of days:

I know training can be a tricky thing, but these are virtually cookie cutter posts-- probably even down to the number of characters. Plus one inline link in the single paragraph, and one URL at the end.

Seems like this is something mollom should be handling quite easily.

WorldFallz’s picture

Attachment: spam1.png (42.58 KB)

There has to be something mollom can do about this. It's been going on for weeks, and I've been dutifully reporting each one I delete.

tvn’s picture

Quick update: during DrupalCon Barcelona, drumm and I had a chat with Nick_vh from Mollom. We looked through examples of false positives on Drupal.org, why Mollom flagged those posts as spam on their end, etc.

Nothing really new; we confirmed our previous plans. The top priority right now is #2529118: Whitelist certain urls for Mollom spam checks. Most of the false positives were flagged as spam because they had valid URLs in them. After that issue is done, we can change Mollom settings from relaxed to more strict. Additionally, Nick recommended we consider enabling captcha for the case when Mollom is unsure whether a post is spam. Right now, when unsure, Mollom marks posts as spam.

Re: last two comments - we haven't talked about those examples. I'll send a few of them to Nick to see if he can give us any info.

dddave’s picture

After looking through the unpublished nodes, and specifically after this issue, #2592187: Request to republish project application, I am under the impression that the tightened Mollom rules we are using to fight the current mega spam wave are (sometimes) unpublishing already published nodes. Could that be?

B_man’s picture

I have been reporting published nodes to Mollom during the unpublish process. There was one instance where I unpublished something legitimate by accident, but that was republished and I left a comment explaining what happened.

tvn’s picture

Quick update on this:
We deployed a solution for whitelisting specific URLs for Mollom. Drupal.org and stackexchange.org are now whitelisted, so hopefully there will be fewer false positives caused by this.
We also fixed the bug with Mollom not checking issue form submissions even when enabled for it.
Following that, we changed protection settings to strict. When text analysis is unsure on forum posts, we now display a captcha.
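
The thread does not describe how the whitelisting was implemented. Purely as an illustration, one possible approach is to drop URLs on whitelisted domains from the text before it is sent for analysis. The Python sketch below makes that assumption; the function names, the URL pattern, and the stripping strategy are all hypothetical, and only the two domains come from the update above.

```python
import re
from urllib.parse import urlparse

# Domains named in the update above.
WHITELISTED_DOMAINS = {"drupal.org", "stackexchange.org"}

# Simplified URL matcher (an assumption; real-world URL detection is messier).
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def is_whitelisted(url):
    """True if the URL's host is a whitelisted domain or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in WHITELISTED_DOMAINS)


def strip_whitelisted_urls(text):
    """Remove whitelisted URLs so they cannot count against the post."""
    return URL_PATTERN.sub(
        lambda m: "" if is_whitelisted(m.group(0)) else m.group(0), text
    )
```

The subdomain check uses a leading dot, so a lookalike host such as `evil-drupal.org` is not treated as whitelisted.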

Re: "After looking through the unpublished nodes and specifically after this issue #2592187: Request to republish project application I am under the impression the tightened Mollom rules we are using to fight the current mega spam wave are unpublishing already published nodes (sometimes). Could that be?"

I could imagine that happening if someone was editing an already published node: Mollom could perhaps classify the edits as spam, though I'm not sure it would unpublish something already published. It could also have been an accident; I've done that too, like Brendan mentioned above. Can you give more examples if you see them?

dddave’s picture

This issue was unpublished https://www.drupal.org/node/2598184 for no good reason. I've published it and approved the account. Look how much content went through without problems. I only had to publish one issue comment with a patch attached.

tvn’s picture

Weird, that one doesn't even have any URL. I'll check with the Mollom people what the reason was there.

tvn’s picture

Checked on that node; it was pure confusion on Mollom's part. It did register our feedback that the node is not spam when you published it. So hopefully it'll learn.

dddave’s picture

Still happening: https://www.drupal.org/node/2617746 (this comment was also unpublished).

This happens more often than I report here, but if I check the approval queue on my tablet I most likely won't report these issues here.

geerlingguy’s picture

Just a general FYI—across multiple sites, I've noticed a very large and sudden increase in spam activity over the past 3-4 weeks. It's getting to the point where using Honeypot + Mollom and even some custom coding to try to outwit them is not working. It's like there's some giant human spam cartel that's suddenly active :(

B_man’s picture

We are still working on spam prevention and have some new measures coming online soon to help us mitigate the spam. Apologies for being intentionally vague about what those measures are; it is best not to expose them, to avoid accelerating subversion. Thanks, geerlingguy, for the observations. I have noticed that too (and not just on d.o and its subsites).

dddave’s picture

Mollom mysteries: https://www.drupal.org/node/2496879#comment-10617550

edit: noticed a couple of unpublished posts like this https://www.drupal.org/node/2627176 (also comments) that were blocked but upon resubmission a bit later went through unchanged. Does this make sense?

apaderno’s picture

Assigned: tvn » Unassigned
Status: Active » Closed (outdated)

Since the Mollom module has been uninstalled, I am closing this issue.