On my personal site (www.jeffgeerling.com), now that Mollom is out of the picture, I've noticed that most of the spam that gets through Honeypot is one of a few variations of the exact same comment, posted over and over on different posts.

I could probably cut that spam down by 90% or more just by having a blacklist of strings which, if they are found in the post body, should result in a Honeypot rejection.

Things like "Investments in cryptocurrency" or "great persuasive essays your argument", or one of dozens of other bits I see that nobody in their right mind would type in a comment on my site.

The idea would be adding:

  1. A new 'Blacklist' tab on the Honeypot configuration section in the admin.
  2. A list of 'Blacklist' strings that is stored in Honeypot's configuration.
  3. A new check that only happens if any 'Blacklist' strings exist: if any field contents match any of the blacklist strings, reject the submission.
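To make the check in step 3 concrete, here is a minimal, language-agnostic sketch written in Python rather than as Drupal code. It is not Honeypot's API: the function name `find_blacklisted`, the field names, and the choice of case-insensitive substring matching are all assumptions made for illustration.

```python
def find_blacklisted(fields, blacklist):
    """Return the first blacklist string found in any field value, or None.

    `fields` maps hypothetical form field names to submitted text;
    `blacklist` is the list of configured strings.
    """
    # Normalize once so matching is case-insensitive (an assumption);
    # blank entries are ignored so an empty config line never matches.
    needles = [s.strip().casefold() for s in blacklist if s.strip()]
    if not needles:
        # No strings configured: the check is skipped entirely.
        return None
    for value in fields.values():
        haystack = value.casefold()
        for needle in needles:
            if needle in haystack:
                return needle
    return None


blacklist = [
    "Investments in cryptocurrency",
    "great persuasive essays your argument",
]
spam = {"subject": "Hi", "comment_body": "Big returns! investments in Cryptocurrency today."}
ham = {"subject": "Thanks", "comment_body": "This fixed my Ansible playbook."}

for submission in (spam, ham):
    match = find_blacklisted(submission, blacklist)
    print("reject" if match else "accept", match)
```

A plain substring match keeps the configuration simple (one phrase per line), at the cost of missing spam that varies the wording slightly.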

This could also be an add-on module, like 'Honeypot Blacklist'. I might go that route just to keep this module clean...

Comments

geerlingguy created an issue.

geerlingguy commented:

Another possible option: an add-on module that uses a Bayesian filter along with some sort of custom ham/spam dataset, using https://github.com/camspiers/statistical-classifier or a similar PHP library. It could be a bit more complicated than a simple blacklist (if just doing a blacklist, this library looks like a decent-ish start: https://github.com/IQAndreas/php-spam-filter), and I would probably call it something like 'Honeypot Intelligence'.
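For a rough idea of what a Bayesian filter over a ham/spam dataset involves, here is a small naive Bayes sketch in Python. It does not use or mirror the API of either library linked above; the class name `NaiveBayes`, the tokenizer, the smoothing choice, and the four training comments are all made up for illustration, and a real dataset would need far more examples.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and split into simple word tokens.
    return re.findall(r"[a-z0-9']+", text.lower())


class NaiveBayes:
    """Tiny two-class (spam/ham) naive Bayes text classifier."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        self.word_counts[label].update(tokenize(text))
        self.doc_counts[label] += 1

    def log_score(self, text, label):
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_words = sum(self.word_counts[label].values())
        total_docs = sum(self.doc_counts.values())
        # Smoothed class prior, so an untrained class never yields log(0).
        score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
        # Add-one smoothing, with one extra slot reserved for unseen words.
        denominator = total_words + len(vocab) + 1
        for word in tokenize(text):
            score += math.log((self.word_counts[label][word] + 1) / denominator)
        return score

    def is_spam(self, text):
        return self.log_score(text, "spam") > self.log_score(text, "ham")


classifier = NaiveBayes()
classifier.train("Investments in cryptocurrency guarantee huge returns", "spam")
classifier.train("Buy great persuasive essays online cheap", "spam")
classifier.train("Thanks, this fixed my Ansible playbook", "ham")
classifier.train("Does this module work with Drupal 8?", "ham")

print(classifier.is_spam("cheap cryptocurrency investments with huge returns"))
print(classifier.is_spam("Does this work with my playbook?"))
```

Unlike a fixed blacklist, this scores every word, so it can catch reworded spam, but it needs a maintained set of labeled comments to stay accurate.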

But honestly, since dropping Mollom this week, my www.jeffgeerling.com spam has increased about 20x (it gets a loooot of traffic and I leave comments open on almost every blog post, forever, so it's a pretty large target for human spammers). I'm evaluating CleanTalk for now; it looks like it might be a worthy successor to Mollom as a spam-prevention-as-a-service offering.

tr commented:

Status: Active » Closed (duplicate)

Duplicate of #2878281: Add simple term blacklist functionality. Please contribute to that issue if you want to see this feature added to Honeypot.