This project is not covered by Drupal’s security advisory policy.

The Statistical Spam Filter (SSF) uses naïve Bayes classifiers as a machine learning technique for automatic text classification. This technique can be used to filter out or block spam content.
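
To make the idea concrete, here is a minimal sketch of a textbook two-class naïve Bayes text classifier with add-one smoothing. It is an illustration only and not SSF's actual code: the class name `NaiveBayesFilter`, its methods, and the tokenizer are assumptions made for this example, and SSF (like b8) may compute its ratings differently.

```python
import math
import re
from collections import Counter


class NaiveBayesFilter:
    """Toy two-class naive Bayes text classifier (hypothetical, not SSF's code)."""

    def __init__(self):
        # Per-class token frequencies and number of texts learned per class.
        self.token_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}

    @staticmethod
    def tokenize(text):
        # Assumed tokenizer: lowercase runs of letters, digits and apostrophes.
        return re.findall(r"[a-z0-9']+", text.lower())

    def learn(self, text, label):
        self.token_counts[label].update(self.tokenize(text))
        self.doc_counts[label] += 1

    def spam_probability(self, text):
        vocab = set(self.token_counts["spam"]) | set(self.token_counts["ham"])
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for label in ("spam", "ham"):
            # Smoothed class prior.
            score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            total_tokens = sum(self.token_counts[label].values())
            for token in self.tokenize(text):
                count = self.token_counts[label][token]
                # Add-one smoothing so unseen tokens never yield log(0).
                score += math.log((count + 1) / (total_tokens + len(vocab) + 1))
            log_scores[label] = score
        # Normalize the two log scores into a probability for "spam".
        top = max(log_scores.values())
        weights = {k: math.exp(v - top) for k, v in log_scores.items()}
        return weights["spam"] / (weights["spam"] + weights["ham"])
```

After calling `learn()` with a few labelled texts, `spam_probability()` returns a value between 0 and 1, where higher means more spam-like. Working in log space avoids numeric underflow on long texts.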

SSF comes with a submodule that makes use of the service provided by SSF to prevent comment spam (see below).

The service provided by SSF can also be used by other modules to create their own spam prevention functionality.

SSF is based upon the work by Tobias Leupold: b8 - A statistical (“bayesian”) spam filter implemented in PHP.

Comparable projects

Spam

Further reading

b8: statistical discussion - Tobias Leupold - February 22, 2010
A Statistical Approach to the Spam Problem - Gary Robinson - March 1, 2003
Better Bayesian Filtering - Paul Graham - January 2003
Spam Detection - Gary Robinson - September 16, 2002

Statistical Spam Filter for Comments

"Statistical Spam Filter for Comments" is a fully functional submodule of the "Statistical Spam Filter" module, created to demonstrate how the service provided by SSF can be used.

This submodule recognizes comment spam and places it in a separate spam folder.

When the spam filter is uncertain whether a comment is spam or ham, it will place the comment in the Unapproved comments folder.
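
The three-way outcome described above (published, spam, or unapproved when uncertain) can be sketched as a pair of thresholds on the filter's spam rating. The function name `route_comment` and the threshold values 0.2 and 0.8 are assumptions chosen for illustration; the submodule's actual settings are not stated here.

```python
def route_comment(spam_probability, ham_threshold=0.2, spam_threshold=0.8):
    """Map a spam rating in [0, 1] to a folder name (hypothetical thresholds)."""
    if not 0.0 <= spam_probability <= 1.0:
        raise ValueError("spam_probability must be between 0 and 1")
    if spam_probability >= spam_threshold:
        return "spam"
    if spam_probability <= ham_threshold:
        return "published"
    # Anything in between is too uncertain to decide automatically.
    return "unapproved"
```

Widening the gap between the two thresholds sends more comments to manual review, while narrowing it lets the filter decide more often on its own.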

An administrator can move published comments that turn out to be spam (false negatives) to the spam folder, and move false positives from the spam folder to the published folder.

When an administrator moves a comment from Unapproved comments to either the published or the spam folder, the spam filter uses that decision to learn the difference between spam and ham more accurately. Correcting false positives and false negatives also triggers automatic re-learning of spam and ham by the filter.
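
One common way to implement this kind of re-learning is to remove a comment's tokens from the class it was wrongly filed under and add them to the correct class. The sketch below assumes that approach. `TokenStore` and its methods are hypothetical names, not the module's API.

```python
from collections import Counter


class TokenStore:
    """Per-class token counts with learn/unlearn support (illustrative only)."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}

    def learn(self, tokens, label):
        self.counts[label].update(tokens)

    def unlearn(self, tokens, label):
        self.counts[label].subtract(tokens)
        # Unary plus drops entries whose count fell to zero or below.
        self.counts[label] = +self.counts[label]

    def correct(self, tokens, wrong_label, right_label):
        # Undo the earlier classification, then learn the corrected one.
        self.unlearn(tokens, wrong_label)
        self.learn(tokens, right_label)
```

For example, if a comment was learned as ham and an administrator later moves it to the spam folder, `correct(tokens, "ham", "spam")` shifts its tokens from the ham counts to the spam counts.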

After a short learning period the spam filter becomes effective at blocking spam and recognizing valid contributions, so the human intervention needed to prevent spam is reduced to a minimum.
