Review and fix antispam measures targeting unconfirmed users [#2993619]

Problem/Motivation

When an unconfirmed user tries to post something, they sometimes encounter a flood-control measure that produces the message:

There was a problem with your form submission. Please wait X seconds and try again.

X is some integer.

In some cases, X seems excessive. In #2973445: Request for 'confirmed' user role for Mirelyj, the user says the message is: "Please wait 2592000 seconds and try again" (2592000 seconds is 30 days, about one month), and in another report the user says: "X could be a year in seconds". This is so excessive that I consider it a bug.
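For scale, the quoted figure can be checked with simple arithmetic. A quick Python sketch (the helper name is made up for illustration):

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400 seconds in one day


def seconds_to_days(seconds: int) -> float:
    """Convert a wait time given in seconds to days."""
    return seconds / SECONDS_PER_DAY


print(seconds_to_days(2592000))  # 30.0
```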

Also, the phrase "There was a problem with your form submission." is too vague to help the user understand the problem.

In addition, there have been several instances where non-spam postings by unconfirmed users have been automatically unpublished. For reports about this, see, for instance, #2993233: Disappeared issue and #2991811: Request for 'confirmed' user role.

Since these are postings that are clearly not spam (they don't even contain links), it seems that our spam-detection algorithm is producing too many false positives.

Proposed resolution

Put a cap on the waiting period that unconfirmed users must suffer (IMHO, 600 seconds - 10 minutes - should be sufficient).
Replace the phrase "There was a problem with your form submission" with something that explains what the problem is. My suggestion: "As an unconfirmed user on Drupal.org, you are not allowed to post too frequently".
Improve the algorithm that unpublishes assumed "spam" to reduce false positives.
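To make the first two points concrete, here is a minimal Python sketch. It is not the actual Drupal.org or Honeypot code (which is PHP); the function names are hypothetical, and the 600-second cap and the message wording are simply the values suggested above.

```python
MAX_WAIT_SECONDS = 600  # proposed cap of 10 minutes (a suggestion, not current behavior)


def capped_wait(computed_wait: int, cap: int = MAX_WAIT_SECONDS) -> int:
    """Clamp the computed wait so it is never negative and never above the cap."""
    return max(0, min(computed_wait, cap))


def flood_message(computed_wait: int) -> str:
    """Build a message that explains why the post was refused and how long to wait."""
    wait = capped_wait(computed_wait)
    return (
        "As an unconfirmed user on Drupal.org, you are not allowed to post "
        f"too frequently. Please wait {wait} seconds and try again."
    )


print(flood_message(2592000))
```

With the cap in place, a computed wait of 2592000 seconds would be reported as 600 seconds, while shorter waits pass through unchanged.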

Remaining tasks

Do it.

User interface changes

To be decided.

API changes

None (I think).

Data model changes

To be decided.

Comments

Comment #1

19 August 2018 at 03:52

gisle created an issue. See original summary.

Comment #2

gisle

he/him

Norwegian Bokmål

Norway

CreditAttribution: gisle at Hannemyr Nye Medier AS commented 19 August 2018 at 03:56

Issue summary:

View changes

Comment #3

gisle

CreditAttribution: gisle at Hannemyr Nye Medier AS commented 19 August 2018 at 06:10

Issue summary:

View changes

Comment #4

drumm

he/him

NY, US

CreditAttribution: drumm at Drupal Association commented 20 August 2018 at 19:59

Assigned:	Unassigned	» drumm
Related issues:		+#2993892: Do not report more waiting time than the expire time

Put a cap on the waiting period that unconfirmed users must suffer (IMHO, 600 seconds - 10 minutes - should be sufficient).

There actually already is an effective cap, but the UI doesn’t show it. I opened #2993892: Do not report more waiting time than the expire time to help with this, but drupalorg_honeypot’s additional time will need a fix too.

Comment #5

21 August 2018 at 15:27

drumm committed 4a27af0 on 7.x-3.x, dev

Issue #2993619: Clean up anti-spam messaging

drumm committed e3f0f68 on 7.x-3.x, dev

Issue #2993619: Don’t add more time than the honeypot expiration window

Comment #6

drumm

CreditAttribution: drumm at Drupal Association commented 21 August 2018 at 15:37

Put a cap on the waiting period that unconfirmed users must suffer (IMHO, 600 seconds - 10 minutes - should be sufficient).

Our custom code that adds extra penalties for spammy characters and phrases was also effectively capped by the honeypot expiration configuration. This is now correctly reported to the user too.

I’ve also reduced the drupalorg_honeypot_factor multiplier to ramp up time limits more slowly. If we see spam coming in more quickly, we may need to adjust this back up.
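As a rough illustration of how a multiplier and an expiration window interact, here is a Python sketch. The real drupalorg_honeypot formula is not shown in this issue, so the linear penalty, the function name, the spam score, and all numbers below are hypothetical assumptions.

```python
def extra_wait(spam_score: int, factor: float, expire_window: int) -> int:
    """Hypothetical penalty in seconds.

    The penalty grows linearly with spam_score * factor, but it is never
    larger than the expiration window, since a longer wait has no effect.
    """
    penalty = int(spam_score * factor)
    return min(penalty, expire_window)


# Compare a larger and a smaller multiplier over the same range of scores.
for factor in (60.0, 20.0):
    waits = [extra_wait(score, factor, expire_window=300) for score in range(0, 11, 2)]
    print(factor, waits)
```

In this toy model the smaller factor reaches the cap later, which is the "ramp up more slowly" effect described above, and the reported wait never exceeds the window.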

Replace the phrase "There was a problem with your form submission" with something that explains what the problem is. My suggestion: "As an unconfirmed user on Drupal.org, you are not allowed to post too frequently".

This message comes from the honeypot module and would be cumbersome to override. We do have our own message in addition to honeypot’s, which I’ve updated to be more accurate in 4a27af0. (We do not currently have anything which automatically grants the confirmed role.) I also found that it might not be displayed alongside Honeypot’s messaging, which is now fixed.

Comment #7

drumm

CreditAttribution: drumm at Drupal Association commented 21 August 2018 at 15:53

Status:

Active

» Fixed

Improve the algorithm that unpublishes assumed "spam" to reduce false positives.

That algorithm is actually Akismet, which is now fulfilling the role Mollom had. Unlike Mollom, Akismet does not have an “unsure” classification or other way to adjust the sensitivity. All we can really do is make the trade-off of either treating their “spam” classification as spam, or not. We did have this misconfigured on some forms for some time, and saw plenty of spam. (They also have a “pervasive spam” classification https://blog.akismet.com/2014/04/23/theres-a-ninja-in-your-akismet/ which we always had enabled while the service was active.)
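The trade-off described here can be sketched as a single policy switch. This Python example does not call Akismet; the `Verdict` enum, the function name, and the `trust_spam_verdict` flag are hypothetical names used only to illustrate the decision.

```python
from enum import Enum


class Verdict(Enum):
    """Simplified classifications; there is no "unsure" value to tune against."""
    HAM = "ham"
    SPAM = "spam"
    PERVASIVE_SPAM = "pervasive spam"


def should_unpublish(verdict: Verdict, trust_spam_verdict: bool) -> bool:
    """Decide whether a post is unpublished under the chosen policy."""
    if verdict is Verdict.PERVASIVE_SPAM:
        # Assumed to be acted on regardless of the policy switch.
        return True
    if verdict is Verdict.SPAM:
        # The only available knob: treat "spam" as spam, or do not.
        return trust_spam_verdict
    return False
```

With `trust_spam_verdict=True`, false positives get unpublished along with real spam; with `False`, both get through.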

Comment #8

gisle

CreditAttribution: gisle at Hannemyr Nye Medier AS commented 21 August 2018 at 17:08

That algorithm is actually Akismet

Hmmm ...

Here is an example of a comment that Akismet classified as "spam":

https://www.drupal.org/project/drupal/issues/2967585#comment-12730792

While I understand that we need tools like this to protect ourselves against spam, perhaps the onboarding process should warn new users that call traces are not welcome here?

Comment #9

gisle

CreditAttribution: gisle at Hannemyr Nye Medier AS commented 22 August 2018 at 04:27

Related issues:

+#2994176: Request for 'confirmed' role

Here is another one: #2994176: Request for 'confirmed' role.

Comment #10

gisle

CreditAttribution: gisle at Hannemyr Nye Medier AS commented 22 August 2018 at 15:10

Related issues:

+#2994350: Request for 'confirmed' role

Another one: #2994350: Request for 'confirmed' role.

Comment #11

hestenet

He/Him

Portland, OR 🇺🇸

CreditAttribution: hestenet at Drupal Association commented 22 August 2018 at 15:16

Thanks for following up on these examples @gisle.

Our republish process should be reporting false positives back to Akismet's algorithm the same way that our mark as spam option reports those patterns to their algorithm.
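A sketch of that feedback loop in Python follows. Akismet's API documents `submit-ham` and `submit-spam` calls for this kind of feedback; how Drupal.org's integration actually invokes them is not shown in this issue, so the action names and the function below are hypothetical.

```python
def feedback_endpoint(action: str) -> str:
    """Map a moderation action to the Akismet feedback call it should trigger."""
    endpoints = {
        "republish": "submit-ham",      # false positive: the post was not spam
        "mark_as_spam": "submit-spam",  # missed spam: the post should have been caught
    }
    if action not in endpoints:
        raise ValueError(f"Unknown moderation action: {action!r}")
    return endpoints[action]


print(feedback_endpoint("republish"))  # submit-ham
```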

We may want to contact them directly about some examples to see what they can tune - unfortunately we don't get many tuning options with this service.

In the end, this is all a balance between how much spam is blocked, how much gets through, and how many false positives result in real users being blocked. There will always be a few false positives, but we should make sure affected users know they can always email support for manual review.