Updated: Comment #45
Problem/Motivation
Currently, the Mailhandler Fetcher filter plugins operate based on whether an in-reply-to header is set. If so, they assume the mail is intended to be a comment. If not, the mail is assumed to be a node.
This is problematic because there are many reasons why the in-reply-to header might be set, such as if an email is forwarded, or if an email is sent using Amazon Simple Email. Intent cannot be inferred based on this header alone.
Proposed resolution
The filter plugins should be moved to the Mail Comment module, and Mail Comment should check for a signature in the body or headers to determine whether an email is a node or comment. Additionally, we will also need to implement hook_feeds_importer_default_alter() (sic?) in Mail Comment to override the Mailhandler Default importer to use the Node Filter.
Remaining tasks
See above.
Related Issues
This is not a perfect solution, but it's a step forward. See comment #41 for a scenario where this fails. We should open a new issue for that once this first step is resolved.
Original report by @bakr
Scenario:
- Sample two drupal user account emails are:
- An existing mailbox name is: helpdesk@domain.com (tested to work with pop3 and IMAP)
Feeds importer is configured to authenticate using: "fromaddress"
If John directly sends an email to: helpdesk@domain.com; no issue, the feeds importer will digest that mail and authenticate successfully and create a node of it.
Equally wise, the same applies for Marry.
Now, whereas...
If Marry send an email message to John, and then, John forwards that message to the helpdesk; it does not get authenticated, thus, no node is created.
I think is is an old issue.
| Comment | File | Size | Author |
|---|---|---|---|
| #42 | mailhandler-forwarded-emails-1370286.patch | 1.26 KB | rob230 |
| #6 | mailhandler-debug.patch | 685 bytes | danepowell |
Comments
Comment #1
bakr commentedHave tested the above scenario with both: IMAP and POP3.
Same effect, no luck.
Reference old Issue: #1192516: Why 'Forwarded Messages' Fail Authentication?
We use MS Exchange Server 2010 with SP1
Comment #2
bakr commentedDear Dane,
I understand from the last Issue that my scenario was not reproducible at your mailing environment, which adds a difficulty to trace down the real problem factor.
However, I might hack the module code to insert a few var_dump($variable) statements, that shall reveal what is going under the hood, to help spot the failure point.
If I would do that, what variables would you direct me to diagnose?
Shall print for a you a nicely formatted array representation:
Thanks and Regards
Bakr
Comment #3
bakr commentedHoping this narrow down will help, I have added the following line below:
drupal_set_message("Parser Thinks mail is empty!");
Keep in mind, that currently I have to (legitamte) mail messages waiting to be handled by mailhandler, in the it.helpdesk mailbox.
I can send the php header dump
Comment #4
bakr commentedComment #5
danepowell commentedWhat happens if you edit the Mailbox (not the importer, but the mailbox configuration) and set "From header" to "Sender"?
It would help troubleshoot if you could apply the attached patch, and submit the two messages from the dblog after an import.
Anecdotally, I've seen similar failure modes (empty messages) if the headers of an mbox are malformed. Presumably this could happen for IMAP/POP3 connections as well.
Comment #6
danepowell commentedForgot the patch- also, just realized this is for 7.x, but you should be able to apply the changes manually if need be. Very simple.
Comment #7
bakr commentedWatchdog returned empty variable strings.
Alternativaly,
I have produced my own header printout, below you will find the headers of two messages,
- the first one, was sent directly to the mailhanlder.
- the other one(was forwarded), still resides in the mail box, and gets unoticed by mailhanlder.
The ration here is to give your wisdom a glimpse into what is happening, such you might catch the malformed spotty area
Comment #8
bakr commentedThe following is the script I used to generate the above
Comment #9
bakr commentedI have tried setting "From header" to
- "Sender"
- "From"
-
no luck in three cases so far.
Comment #10
bakr commentedBy the way..
Even inserting the:
, proved that it was not even invoked.
the watchdog table was empty.
This indicates the logic is not reaching even here.
But for sure it got invoked with importing a normal direct message. that is, the watchdog entries where created normally.
Comment #11
bakr commentedBy the way, I use the mimemail module, does it affect?
To your knowledge, Is it different from phpmailer?
Comment #12
bakr commentedSo, if the $header (as per the patch) returned nothing in the watchdog table; What could be the deeper further step to dig for trouble within the module?
Keep in mind that the ajax test script for connectivity with the mailbox successfully identifies some messages as residing in the mailbox.
somewhere in between their is a "certain" extra exception handling that needs to be taken care to make thing less fragile, and thus eliminate the anecdotal behavior.
Best Regards
Bakr
Comment #13
danepowell commentedI have a suspicion... what happens if you go to admin/build/feeds, edit your importer, edit the fetcher, and change the filter from 'nodes' to 'all'?
The full path for that configuration page is $base_url/admin/build/feeds/edit/[importer_name]/settings/MailhandlerFetcher, where [importer_name] is the machine name of your importer.
Comment #14
bakr commentedAha,
After setting the fetcher setting so filter "All", then trying to re-import, the result is the following error.
An error has occurred.
Please continue to the error page
An error occurred. /hd/?q=batch&id=217&op=do
Fatal error: Cannot instantiate abstract class MailhandlerFilters in F:\htdocs\hd\sites\all\modules\mailhandler\mailhandler.module on line 168
Import simply fails.
Hmmm!
Comment #15
bakr commentedthe $class variable was pointing to "MailhandlerPhpImapRetrieve"
Now we need to inspect that class file "MailhandlerPhpImapRetrieve.class.php"
I shall inject some var_dumps...
I think we are getting somewhere with this...
Comment #16
bakr commentedOk... looking in the problematic class...
I noticed that in-between I had to restart apache, as php was claiming out of memory... there is a leak here, eventhough I have more than 1GB allocated for PHP - can you imagine!
anyway....
the $new variable dumps the following:
array(4) { [0]=> int(1) [1]=> int(2) [2]=> int(3) [3]=> int(4) }
So ... stepping next ....
Comment #17
bakr commentedThe following var_dumps where injected (see +++)
and the result error message which appears during import was as such:
So, no surprise, the failing line is:
$message = $this->retrieve_message($result, $mailbox, array_shift($new), $encoding, $filter_name);
Comment #18
bakr commentedNow ...
going to dig some var_dumps into the following function, to see what can be revealed..
And i think, that because the above function fail unsafely some how, this may elaborate why the upper calls was failing to be instantiated.
hmmm..
Comment #19
bakr commentedAll recent findings indicate that there is some hidden limitation in the filters handling mechanism.
As you previously recall, that my inbox was having two stale (forwarded) messages. the whole assumption is that they should be digested by the mail handler and cleared from mail server.
Now, something interesting happened, one of them got through, (while I was fiddling with the code), the other did not.
Now I get this error which further leads to the sneaky trouble area, or call it the root cause:
Now we are pointing fingers to:
class MailhandlerFilters
Comment #20
bakr commentedAt last it works (Now mailhandler is able to import every message in the mailbox normally, regardless it was forwarded or directly sent), but after commenting the $filter (see below)
Ration is, this is indeed in need of some review of the filtering.
Ofcourse this is not a solution, but a lead.
The filtering was creating all the issue in the up-chain of mailhandler funtion.
Comment #21
bakr commentedComment #23
bakr commentedFindings Summary
============
The following function is the root cause of trouble, yet I do not know the root of the root!
Comment #25
danepowell commentedCan you download 2.3? The error related to MailhandlerFilter was fixed there (along with many other bugs)... or you can manually apply the fix at #1369672: Error when using 'All' fetcher filter if you just can't upgrade.
Comment #26
danepowell commentedAnyway, assuming upgrading fixes that error, the real issue is that the Node filter is rejecting the forwarded emails due to the presence of the in-reply-to header.
Comment #27
bakr commentedI agree with #26.
That is what is happening.
and yes true the recent fix was a step forward but closing another parallel issue.
I do not know why does the filter reject ... if only i can get to know this.
Also, the filter, as a function is doing great, can't go without it... if it wasn't for this issue.
Comment #28
danepowell commentedHmm, this is turning into a pretty kettle of fish... the original goal of the filters was to separate replies from novel emails, assuming that the novel ones would be 'nodes' and the replies would be 'comments'. However, I now realize that this assumption isn't necessarily safe- for instance, someone could reply to a notification of a node post in order to edit that post. Conversely, someone could compose a fresh email with a cid in the commands in order to post a comment. Granted, these are probably edge cases, but I think they indicate an architectural problem.
Separating replies from non-replies is hard enough, but I'm not really sure how to figure out whether the user is trying to post a node or a comment without forcing them to explicitly specify.
Comment #29
danepowell commentedOkay... I think the way to handle this is to run filters after the parser. The condition for a comment will be
isset($pid) && isset($nid). The condition for a node will be!isset($pid). Note that the pid/nid could be set by mailcomment, or by a command from the user.Comment #30
danepowell commentedBah... #29 won't work either- an importer with the Feeds Node Processor would import intended comments as nodes.
I'm not sure if this will get fixed in 6.x-2.4- I just can't figure out a way to solve it that's not hackish... if anyone has suggestions, I'd welcome them.
Comment #31
bakr commentedmaybe as a short solution, would be to add a checkbox to be added in importer configuration screen, so that to bypass filtering is a message is detected being a "forwarded" message
Am not sure how far this can help.
Comment #32
danepowell commentedMarked #1408512: Better node/comment message filters as duplicate. The suggestion there, that I like, is to look for both the in-reply-to header AND a Mail Comment server string. It's still not architecturally ideal, but I think it's better than the current system.
(The reason this isn't 'ideal' is that, theoretically, you could try to post a comment by manually specifying the node id / parent id using the commands system. But this would get classified as a node by the filter proposed here. A strange workflow to be sure, but not completely inconceivable.)
Comment #33
obrienmd commentedWouldn't the current filters also be inappropriate for the theoretical situation, as they would run before the commands are processed and not catch that as a comment?
I have the capability to write patches here (I think), but I see two ways forward and am interested in which way you'd like me to proceed:
1) Move this functionality over to mailcomment (preferred IMHO)
2) Add a submodule to mailhandler which provides this, and requires mailcomment
Or am I off base in thinking these are the only two options with my current proposal (which I agree is subpar but gets us further than the current solution)?
Comment #34
bakr commentedMy voice goes to option 1: the mailcomment.
This would segregate the issue into workable divide-and-conquer approach.
Comment #35
danepowell commentedI also think that both of these filters should go in the Mail Comment module, since they now rely on extracting the Mail Comment message signature.
We will also need to implement hook_feeds_importer_default_alter() (sic?) in Mail Comment to override the Mailhandler Default importer to use the Node Filter.
Comment #36
obrienmd commentedHrm, well perhaps I need to stop trying to convince myself I'm a decent developer. I failed miserably at this last night / this morning. The idea's mature, just needs to be
Note: A simple hack of the MailhandlerFiltersNodes and MailhandlerFiltersComments classes to the effect above did a great job of solving our issue in my limited testing.
Comment #37
obrienmd commentedHere is my hackified fetch function from MailhandlerFiltersComments (FiltersNodes is vise-versa on the TRUE and FALSE):
Comment #39
Todd Young commentedI am (was) experiencing the same thing and I think I changed the filter to "all" (among other random things) and I think the problem went away... Are we fixed on this one? Am I ok?
Thanks for great work, y'all...
Comment #40
1richards commentedHad the same problem and switching Mailhandler Fetcher message filter to "All" resolved this for me.
For reference, I am using a gmail account and Feeds 7.x-2.0-alpha5 and Mailhandler 7.x-2.5
thanks all
Comment #41
MixologicIt doesn't seem like a solid architecture to attempt to extrapolate whether or not an email is intended to become either a node or a comment based upon whether or not a general purpose SMTP header is set. There are myriad reasons beyond the use cases here that would cause the 'in-reply-to' header to be set.
In my case, Im using the Amazon Simple Email Service and it sets the in-reply-to header on every email it sends. Im sure other email forwarding services do this as well. So checking if the in-reply-to header in MailHandlerFiltersNodes.class.php seems like its bound to only work in certain use cases and not other.
Additionally, a reply isnt necessarily restricted to creating comments in all use cases either. We're building a Q&A style site, where we have the original question, and answers, and comments on both questions and answers. Replies to the the original question will create Answers, whereas replies to the Answers will create comments.
I think the only way to determine the intent is by setting specific commands in the emails, and not relying on the in-reply-to header to contain any sort of intent.
Comment #42
rob230 commentedI think setting it to 'All' is more of a workaround (avoiding the issue) rather than a solution. The method in #37 makes sense to me so I've tried it and it works. Patch for 7.x-2.x attached for anyone that needs it.
It's not a satisfactory solution though because it assumes the presence of Mail Comment. If you don't have Mail Comment, setting the import filter to 'All' will probably fix it for you since you aren't expecting comments anyway. I can't think of a nice way that would work in both cases.
Comment #43
kscheirerHmm, hard to say where we are with this.
I suppose the preg_match patch helps, but only in certain cases.
Setting the import filter to All works in some cases as well.
Neither provide a solid solution.
Is #35 the preferred approach? In which case we should move this issue to Mail Comment, write a patch against that to provide the 2 filters and hook_feeds_importer_default_alter to change the filter provided by Mailhandler?
#41 seems to indicate that any method using in_reply_to will fail in some cases.
That sounds reasonable. Could that command be added to the default commands, and thereby be easy to disable/configure?
If someone can provide me with a good approach and outline the changes, I'm happy to write the patch!
Comment #45
danepowell commentedI updated the issue summary to include what I see as the path forward here. Let's implement the solution I proposed in #35 as a solution that should cover the vast majority of use cases. Then we can open a new issue to cover edge cases such as #41.
Comment #46
danepowell commentedComment #47
maedi commentedAs mentioned In-Reply-To can be set even when the email is not a genuine reply. The original Message-ID also moves from In-Reply-To to the References header in longer email chains.
I think the logic should be to just see if the email has a valid Message-ID in the 'reply headers'. Here's a filtering solution based on a patch I've created for Mail Comment: #2852329: Patch: Better parse Message-ID
The regex could be generalised to work with without mail comment (the string simplified and a check that the domain is the originating site). Or as mentioned the filter could be moved to Mail Comment. If the other patch gets through then we can reference the regex variable from Mail Comment.