Hi

I've been using OGM for a year or so now and am quite delighted with it :)

I notice that occasionally I get multiple emails when a user does a Reply-to-all. I've tested and can repro this.

Scenario:
-- User is in several (let's say 3) OG groups. I am also in these same groups
-- Node is generated and placed in all of the user's OG groups
-- User receives a single email, but the email's "to" list contains all of the OGM email addresses for the user's OG groups
-- User does a "reply to all"
-- I receive 3 identical emails from OGM containing the user's reply
-- I also note the comment has been added to the node 3 times

This happens only when the user has received the email on certain email systems. AOL is my main offender, but I've also had issues with comcast.net and some company-specific email systems. It doesn't happen when the reply-to-all is done from gmail, live.com/hotmail, or my Exchange server at work.

Looking at my mailgun logs, I see these differences:
Reply-to-all from Exchange:
routed: 3 of these entries, all from the Exchange user's email address. Each lists all 3 OGM addresses in message | recipients. Each lists a different one of the 3 OGM addresses in "recipient"
accepted: only one of these, from exchange user's email address, and listing all 3 ogm addresses in message | recipients. "Recipient" is mailgun_callback_mime
posted: only one of these, from exchange user's email address, and listing all 3 ogm addresses in message | recipients. "Recipient" is mailgun_callback_mime

Reply-to-all from AOL:
routed: 3 of these entries, all from the AOL user's email address. Each lists a single, different OGM address in message | recipients, and the same address in "recipient"
accepted: 3 of these entries, all from the AOL user's email address. Each lists a single, different OGM address in message | recipients. "Recipient" is mailgun_callback_mime
posted: 3 of these entries, all from the AOL user's email address. Each lists a single, different OGM address in message | recipients. "Recipient" is mailgun_callback_mime

I've looked at the message headers (shown below, as received by mailgun) in the AOL response, and they're basically the same as in an Exchange response. The only elements that differ between the two versions are the message | headers section (which seems to be just formatting) and the message | recipients section, where AOL lists one address and Exchange lists 3. Not sure if there is some way to adjust the headers so AOL sends only one email instead of 3, or if AOL just sucks.
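
To double-check that the two "to" headers really differ only in formatting, they can be parsed and compared as sets of addresses. This is a minimal sketch in Python (not part of OGM or Mailgun), using the two header values copied from the log entries in this post. The helper name `address_set` is made up for the illustration.

```python
from email.utils import getaddresses

# "to" header values as logged by mailgun (tabs included).
aol_to = ("mailhandlers@example.com, posting-queue1@example.com, "
          "\tz-test-public-group-in-grs@example.com")
exchange_to = ('"mailhandlers@example.com"\t<mailhandlers@example.com>,\t'
               '"posting-queue1@example.com"\t<posting-queue1@example.com>,\t'
               '"z-test-public-group-in-grs@example.com"\t'
               '<z-test-public-group-in-grs@example.com>')


def address_set(header_value):
    """Return the lower-cased addresses found in one address header."""
    return {addr.lower() for _name, addr in getaddresses([header_value]) if addr}


# True when both headers name exactly the same addresses.
same_recipients = address_set(aol_to) == address_set(exchange_to)
```

If `same_recipients` is true, the difference that matters is in message | recipients, not in the "to" header.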

Any advice?

Thanks

AOL Headers:

  "id": "A3EGO7u4Sg6GiCja1P25Zw",
  "envelope": {
    "targets": "posting-queue1@example.com",
    "transport": "local",
    "sender": "{user}@aol.com"
  },
  "recipient-domain": "example.com",
  "method": "smtp",
  "campaigns": [],
  "user-variables": {},
  "flags": {
    "is-routed": null,
    "is-authenticated": false,
    "is-system-test": false,
    "is-test-mode": false
  },
  "log-level": "info",
  "routes": [
    {
      "priority": 1,
      "expression": "catch_all()",
      "description": "",
      "actions": [
        "forward('http://example.com/og_mailinglist/mailgun_callback_mime')"
      ]
    }
  ],
  "message": {
    "headers": {
      "to": "mailhandlers@example.com, posting-queue1@example.com, \tz-test-public-group-in-grs@example.com",
      "message-id": "14eae05b85b-2d97-37d6@webstg-m01.mail.aol.com",
      "from": "{user}@aol.com",
      "subject": "Re: [mailhandlers][2 other groups] another post to test aol"
    },
    "attachments": [],
    "recipients": [
      "posting-queue1@example.com"
    ],
    "size": 4304
  },
  "recipient": "posting-queue1@example.com",
  "event": "accepted"

Exchange headers:

  "id": "eD7mH2SRSpisvgrgqtfItA",
  "envelope": {
    "targets": "posting-queue1@example.com",
    "transport": "local",
    "sender": "{user1}@exchangeserver.com"
  },
  "recipient-domain": "example.com",
  "method": "smtp",
  "campaigns": [],
  "user-variables": {},
  "flags": {
    "is-routed": null,
    "is-authenticated": false,
    "is-system-test": false,
    "is-test-mode": false
  },
  "log-level": "info",
  "routes": [
    {
      "priority": 1,
      "expression": "catch_all()",
      "description": "",
      "actions": [
        "forward('http://example.com/og_mailinglist/mailgun_callback_mime')"
      ]
    }
  ],
  "message": {
    "headers": {
      "to": "\"mailhandlers@example.com\"\t<mailhandlers@example.com>,\t\"posting-queue1@example.com\"\t<posting-queue1@example.com>,\t\"z-test-public-group-in-grs@example.com\"\t<z-test-public-group-in-grs@example.com>",
      "message-id": "66A8193CA58FAC47912322CFD138FC8718EA4C47D8@exchangeserver.com",
      "from": "\"{user1 name}\" <{user1}@exchangeserver.com>",
      "subject": "RE: [mailhandlers][2 other groups] another post to test aol"
    },
    "attachments": [],
    "recipients": [
      "mailhandlers@example.com",
      "z-test-public-group-in-grs@example.com",
      "posting-queue1@example.com"
    ],
    "size": 2803
  },
  "recipient": "posting-queue1@example.com",

Comments

mahfiaz’s picture

OGM relies on the message-ID being the same. The message-IDs are saved to the database, and when the same message (judged by its ID) hits the OGM code again, it is silently dropped. I think MTAs also drop emails with identical message-IDs, so the OGM code often won't see these duplicates at all (which is what happens in the Exchange case).

I suppose AOL does not set a message-ID in the email headers itself, and that field is only filled in after the message has left the first MTA, which sends out several messages, in AOL's case with different message-IDs.

If we wanted to work around that, we would have to check whether we already have a mail that matches the same sender, recipients and date. The lookback window could be as short as 30 minutes, since emails rarely travel longer than that. It would also be possible to run this check only on mail from graylisted domains, or to skip it for mail from whitelisted domains.
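
The check described above can be sketched roughly as follows. This is an illustration in Python rather than OGM code. The class name `RecentMessages` and its in-memory dictionary are assumptions made for the example, and a real implementation would need a database table shared between requests. "Recipients" here means the addresses from the "To" header, which are the same in every copy.

```python
import time


class RecentMessages:
    """Remembers (sender, recipients, date) fingerprints for a limited time."""

    def __init__(self, window_seconds=30 * 60):
        self.window = window_seconds
        self._seen = {}  # fingerprint -> time it was first seen

    def is_duplicate(self, sender, recipients, date, now=None):
        """Return True if the same mail was already seen inside the window."""
        now = time.time() if now is None else now
        # Forget fingerprints that are older than the window.
        self._seen = {key: seen_at for key, seen_at in self._seen.items()
                      if now - seen_at < self.window}
        # Recipient order and letter case should not matter.
        key = (sender.lower(), frozenset(r.lower() for r in recipients), date)
        if key in self._seen:
            return True
        self._seen[key] = now
        return False
```

The first copy of a reply returns `False` and is processed. Any later copy with the same sender, recipients and date inside the window returns `True` and can be dropped.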

hanksterr7’s picture

Interesting. Which message ID are you referring to? In my headers shown above, there is an ID at the top, and then message-id inside message | headers. I assume you mean the one inside message | headers

For a reply-to-all done by an Exchange user, mailgun shows 3 "routed" entries, but turns these into only one "accepted" and one "posted" entry. The three "routed" entries each have the same message | headers value, and the "recipients" section lists all 3 OGM addresses in each. The "recipient" field (outside message | headers) is different in each, and is one of the three OGM addresses. The timestamp and the ID field (both also outside message | headers) are different in each as well, while the message-id inside message | headers is the same in each. There is no timestamp anywhere else, so I'm not sure how to tell that these three should actually be considered a single message, unless we look only at sender and message | headers | message-id. But that would get messed up if someone sends multiple emails in response to the email they received. Using a time window, i.e. treating emails with the same sender and message | headers | message-id as duplicates only if they arrive within a short time of one another, could be tricky.

In the AOL case, I also get 3 "routed" entries, but mailgun then logs 3 "accepted" and 3 "posted" entries. The difference is that message | recipients shows only one OGM address in each of the 3 messages, so I guess mailgun doesn't know to combine the messages. Even in the AOL case, message | headers | message-id is the same in the 3 messages.

hanksterr7’s picture

The code in _og_mailinglist_process_email that tries to detect duplicates is not helping since the three response emails arrive practically at the same time and are being processed by different threads on the server. As such, when each thread starts, the other threads have not yet created a row in og_mailinglist_source :(

Any ideas for how this processing could be serialized?

    // Let's double check if this message has been here.
    // https://drupal.org/node/2181049
    $result = db_query('SELECT ogms.nid FROM {og_mailinglist_source} ogms
        WHERE ogms.message_id = :msgid',
        array(':msgid' => $email['headers']['message-id']));
    if ($result->rowCount() > 0) {
        watchdog('og_mailinglist', 'found message-id already exists in ogms. aborting incoming message', array(), WATCHDOG_INFO);
        exit();
    }
    else {
        watchdog('og_mailinglist', 'message-id not found in ogms. continuing.', array(), WATCHDOG_INFO);
    }
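
On the serialization question: one common approach is to let the database arbitrate instead of doing a SELECT first. Each process tries to insert the message-id into a table with a unique key before doing any other work, and only the process whose insert succeeds carries on. The sketch below shows the idea in Python with SQLite. It is not OGM code, the `message_claim` table is hypothetical, and it assumes the duplicate copies share a message-id, as the mailgun logs above suggest.

```python
import sqlite3


def open_claims(path=":memory:"):
    """Open the database and make sure the (hypothetical) claim table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS message_claim (message_id TEXT PRIMARY KEY)"
    )
    return conn


def claim_message(conn, message_id):
    """Return True only for the first caller that claims this message-id."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT OR IGNORE INTO message_claim (message_id) VALUES (?)",
            (message_id,),
        )
    # rowcount is 1 if the row was inserted, 0 if the key already existed.
    return cur.rowcount == 1
```

Because the uniqueness check and the write happen in a single statement, two processes that arrive at the same moment cannot both succeed, which is the gap in the SELECT-then-process check.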
hanksterr7’s picture

Hi

I have a solution that seems to work. Not in a patch yet. Let me know what you think of this idea.

Mod is to _og_mailinglist_process_email() in og_mailinglist_transport.inc

The problem is that when a reply-to-all is done in AOL, AOL generates an email for each address listed in the recipients field of the headers. The headers in the reply have a recipients array containing a single address, which is the same as the value of "recipient" (and as such, each of the emails has a different value of recipients and recipient fields, with the same value being in recipients and recipient in each email).

When MailGun receives these replies, it can't tell that they are the same message, so it forwards each on to mailgun_callback_mime.

Reply-to-alls from Exchange are different. Exchange also generates an email for each address in recipients, but each reply lists all addresses in the recipients field and only one of the addresses in the recipient field. So MailGun can tell that the replies are the same, since from, to, subject, message-id and recipients are all the same across all the replies. MailGun forwards only one of the replies on to mailgun_callback_mime and swallows the rest. (It seems MailGun looks at the recipients field when trying to determine if two messages are the same. In the AOL case the "to" header is actually the same across the emails (containing all recipients), but "recipients" contains only a single address in each email.)

The current check in _og_mailinglist_process_email() that looks for message-id in og_mailinglist_source is not adequate. The row doesn't get added to og_mailinglist_source until after the node is created for the comment. By that time, MailGun has already forwarded another of the duplicate replies on to _og_mailinglist_process_email().

So instead, I do a check immediately in _og_mailinglist_process_email(), and write a row immediately to a new table, ogm_lock, if a row with the same headers has not already been found in ogm_lock within 5 seconds of the time the current message is being processed (the 5 second limit could probably be extended to maybe 30 seconds. Unlikely a user would ever do more than one reply to the same email within 30 seconds). This allows a user to do multiple replies to the same email, as long as they are some # of seconds apart. For replies to the same email that come in too fast, they would be considered duplicates of one another and only one would be allowed to continue.

Code is below. Obviously not standard drupal (I'm new at this), but it is working for me

Thanks

ogm_lock table columns:
headers longtext
timestamp char(15)
rowid int(11) PK auto_increment

Need an index on headers
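
A note on that index: MySQL needs a prefix length to index a TEXT column, so indexing a longtext directly is awkward. One alternative, offered as an untested assumption rather than something tried against OGM, is to store a fixed-length hash of the header string in a short indexed column. A sketch of the idea in Python, where the function name and the field order are made up for the example:

```python
import hashlib


def header_fingerprint(sender, to_addresses, subject, message_id):
    """Return a 64-character hex digest identifying one logical message."""
    # Sort the addresses so the order in the "to" list does not matter.
    parts = [sender] + sorted(to_addresses) + [subject, message_id]
    # Join with a separator that should not occur in header values.
    joined = "\x1f".join(parts)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()
```

The digest always has the same length, so it fits in a char(64) column with an ordinary index.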

    if (!empty($email['headers']['x-beenthere'])
      || empty($email['headers']['message-id'])) {
      exit();
    }

    // Current time in seconds, including the fractional part.
    $now = microtime(TRUE);

    // Build one string from the fields that are identical across the
    // duplicate replies: from, to, subject and message-id.
    $h = '';
    foreach ($email['from'] as $a => $b) {
      $h .= ' ' . $a . ' ' . $b;
    }
    foreach ($email['to'] as $a => $b) {
      $h .= ' ' . $a . ' ' . $b;
    }
    $h .= ' ' . $email['headers']['subject'] . ' ' . $email['headers']['message-id'];

    // Placeholders keep quotes in the subject from breaking the query.
    $args = array(':headers' => $h, ':now' => $now);
    $result = db_query('SELECT rowid FROM {ogm_lock} WHERE headers = :headers AND :now - timestamp < 5.0', $args);
    if ($result->rowCount() > 0) {
      watchdog('og_mailinglist', 'message has already been processed. aborting', array(), WATCHDOG_INFO);
      exit();
    }
    watchdog('og_mailinglist', 'message not found in ogm_lock. continuing', array(), WATCHDOG_INFO);
    db_query('INSERT INTO {ogm_lock} (headers, timestamp) VALUES (:headers, :now)', $args);



    // Let's double check if this message has been here.
mahfiaz’s picture

There is one possible problem. Sometimes the sender is subscribed to only one of several groups. In that case, if the first e-mail that hits OGM was sent to a group where the user has no permission to post, it might get rejected (I would have to check the code; I am not really sure how it would behave).

hanksterr7’s picture

Interesting question. I don't think there is any issue.

Say user A is in 3 groups and creates a post in all three groups
Another user B is in 2 of the 3 groups
User B gets an email that has been sent to the 3 group email addresses.
User B does a reply to all. If user B is using Exchange or gmail, their mail service generates three emails, and each email lists all three OGM addresses in both "To" and "Recipients", and one of the 3 OGM addresses in "Recipient"
Mailgun swallows two of the three emails it received (since they had the same "Recipients" and "message-id") and sends only the first to mailgun_callback_mime
OGM looks at the incoming email, finds the user from "From", and the groups from "To" and "CC" header fields. It finds the user to be a member of at least one of the groups listed in "To" and "CC", so the email is processed. The "Recipient" field is the only one that does not list all three OGM addresses, but this is not relevant to the "is the user a member of the groups" question.

If User B is using AOL, AOL generates 3 emails, with each listing all 3 OGM addresses in the "To" field, and only one of the three addresses in "Recipients" and "Recipient" (each email lists a different address, but within one email "Recipients" and "Recipient" hold the same address).
Mailgun thinks the three emails are different and sends all three to mailgun_callback_mime.
OGM receives all three emails and (with my patch) discards the 2nd and 3rd (since all three list the same "to", "from", "subject" and "message-id")
OGM looks at "To" and "Cc" and decides user is a member of at least one of the three groups listed in "To", so it processes the message.

All of this works because OGM is looking at "To" and "Cc", and not Recipients or Recipient fields, in deciding if user is a member of the groups.
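
The membership decision described above can be sketched like this. It is an illustration in Python of the logic as I understand it, not the actual OGM code, and the function names are made up for the example.

```python
from email.utils import getaddresses


def addressed_groups(to_header, cc_header=""):
    """Return the lower-cased addresses listed in the To and Cc headers."""
    headers = [h for h in (to_header, cc_header) if h]
    return {addr.lower() for _name, addr in getaddresses(headers) if addr}


def may_post(user_group_addresses, to_header, cc_header=""):
    """True if the user belongs to at least one group named in To or Cc."""
    member_of = {g.lower() for g in user_group_addresses}
    return bool(addressed_groups(to_header, cc_header) & member_of)
```

With User B in two of the three groups, `may_post` returns true whichever copy arrives first, because every copy carries the full "To" list. The per-copy "Recipient" value never enters the decision.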

So life is good :)