Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
As soon as your Drupal site name is composed of non-ASCII characters, the "From" header is causing trouble.
Gmail for instance, will systematically take it as spam.
Comment | File | Size | Author |
---|---|---|---|
#79 | 2717965-2-79.patch | 4.64 KB | alexpott |
#79 | 77-79-interdiff.txt | 1.15 KB | alexpott |
#77 | site_name_is_not_utf_8-2717965-77.patch | 4.5 KB | yogeshmpawar |
#72 | 2717965-72.patch | 4.53 KB | alexpott |
#72 | 62-72-interdiff.txt | 3.48 KB | alexpott |
Comments
Comment #2
pguillard CreditAttribution: pguillard commentedHe is a first patch suggestion. Any thoughts ?
At the same time, I guess -and hope- this is not a duplicate from this one : https://www.drupal.org/node/2223967
Comment #3
pguillard CreditAttribution: pguillard commentedAfter patch :
Comment #4
jibranNice catch! Can we have some tests please?
Comment #5
pguillard CreditAttribution: pguillard commented@jibran : Do you mean I should create a unit test in Drupal\Tests\Core\Mail ?
Comment #6
pguillard CreditAttribution: pguillard commentedComment #7
pguillard CreditAttribution: pguillard commentedIt was assigned to the wrong component.
Comment #8
pguillard CreditAttribution: pguillard commentedWith tests
Comment #9
oxy86 CreditAttribution: oxy86 commentedSame problem here. In my D8 installations, when we are using greek characters in "Site Name" (/admin/config/system/site-information) all system emails are flagged as spam by Gmail. For instance, in the site www.asfaleia-insurance.org which has the site-name "ΑΣΦΑΛΕΙΑ-INSURANCE", system emails are being treated as spam by Gmail (screenshot attached). In fact, I think ALL drupal8 sites with Greek names will experience the same problem. That problem did not exist in Drupal7 or Drupal6. The Unicode::mimeHeaderEncode patch in #2 resolves the issue.
Comment #10
jibranThis is ready.
Comment #12
pguillard CreditAttribution: pguillard commentedReroll
Comment #14
pguillard CreditAttribution: pguillard commentedHum, let's see how it goes on 8.2.x-dev branch...
Comment #16
oushen CreditAttribution: oushen as a volunteer commentedComment #17
oushen CreditAttribution: oushen as a volunteer commentedComment #18
cilefen CreditAttribution: cilefen as a volunteer commentedI can probably go with Major priority on this one but it is not critical by our definitions.
I like imports to be in alphabetical order.
Same.
What does the "W" mean in this function name?
There is a typo in the variable name and the string concat operator must be surrounded by a single space.
This variable is unused.
The last method in a class must have a blank line after it.
Do not use the same function to encode the string. Use the encoded string instead. That's a better test.
Comment #19
cilefen CreditAttribution: cilefen as a volunteer commentedAlso, I think that we do not need another test method for this. A tiny modification of testFromAndReplyToHeader() would do.
Comment #20
yogeshmpawarI have rerolled the patch as per comment #18 & #19
Comment #22
yogeshmpawarFixed the test fails.
Comment #23
cilefen CreditAttribution: cilefen as a volunteer commented#18-4 and the last unnumbered comment in #18 still apply:
Comment #24
yogeshmpawarThanks @cilefen, made changes as per your suggestion #23.
Comment #25
cilefen CreditAttribution: cilefen as a volunteer commentedThis variable name is still mistyped. Plus, since it is used only once, it doesn't need to exist. The string value can be used in the assertEqual.
Comment #26
yogeshmpawarThanks @cilefen, made changes as per your suggestion #25
Comment #27
pguillard CreditAttribution: pguillard commentedAs my own patch goes way back now, I guess I can set this one to RTBC !
Comment #28
cilefen CreditAttribution: cilefen as a volunteer commentedComment #29
alexpottI'm confused because \Drupal\Core\Mail\Plugin\Mail\PhpMail::mail() encodes all the headers.
Looking at the email D8 sends without the patch it is mime-encoded....
So what I think is the problem is that the email is mime encoded too.
The fix needs to be applied in \Drupal\Core\Mail\Plugin\Mail\PhpMail::mail to do something special for 'from' to extract the final things between < and > and then mime encode the rest.
Comment #30
alexpottHmmm re #29...
Is in \Drupal\Core\Mail\MailManager::doMail() so there is some expectation about mime in the mail manager - but we the patch attached we'll be calling mimeHeaderEncode twice on the same string... does that matter?
Comment #31
cilefen CreditAttribution: cilefen as a volunteer commentedComment #32
hampercm CreditAttribution: hampercm as a volunteer and at Acquia commentedWorking on this at SprintWeek 2017 in Boston...
Comment #34
pguillard CreditAttribution: pguillard commentedJust a patch re-roll
Comment #35
gaurav.kapoor CreditAttribution: gaurav.kapoor at OpenSense Labs commentedComment #36
pguillard CreditAttribution: pguillard commentedComment #37
hampercm CreditAttribution: hampercm as a volunteer and at Acquia commentedComment #38
Liam MorlandI independently came up with the same solution for this myself and I have tested the patch in #34.
Comment #39
alexpott#29 still hasn't been address. I don't think MailManager should be doing mimeEncoding. This should be the responsibility of \Drupal\Core\Mail\Plugin\Mail\PhpMail::mail() - which is atm mime encoding it is just doing it incorrectly in the case of the from header. And probably other headers.
Comment #40
Liam MorlandFixing it in PhpMail::mail() means we have to parse the from header to separate out the email address from the rest of it. While this is not particularly hard to do, it is more awkward then assembling things correctly from the start.
Comment #41
alexpottYeah this is really tricky to see where the responsibility lies. All the mime encoding atm is happening in \Drupal\Core\Mail\Plugin\Mail\PhpMail::mail() but \Drupal\Core\Mail\MailManager::doMail() sets the mime version.
Looking at all the options I think this is probably the best option but I do think it is worth...
... adding some commentary to the above code about why this is mime encoded here, and we aren't just letting PhpMail do it's thing. I've tried to read all the email RFCs and I think the important thing is this. The From field is specified here: https://tools.ietf.org/html/rfc2822#section-3.4 - if we mime encode the whole field we break the the format of the field being
[display-name] angle-addr
where angle-addr is[CFWS] "<" addr-spec ">" [CFWS] / obs-angle-addr
and addr-spec is https://tools.ietf.org/html/rfc2822#section-3.4.1tldr; mime encoding the entire from header means that the email address is not encoded correctly.
So back to needs work for improving/adding code comments.
Comment #42
Liam MorlandPatch #34 with comments. Rolled against 8.4.x.
Comment #43
yogeshmpawarAny Update on this issue ?
Comment #44
cilefen CreditAttribution: cilefen as a volunteer commentedIt needs a review done.
Comment #46
cilefen CreditAttribution: cilefen as a volunteer commentedI am adding credit to balagan for identifying and closing a duplicate at the Baltimore triage.
Comment #47
mh35 CreditAttribution: mh35 commentedComment #48
Liam MorlandThe other item is a duplicate of this. What we need is a review of the current patch.
Comment #49
Anthony Fok CreditAttribution: Anthony Fok as a volunteer commentedBased on my testing, the current patch is inadequate, namely:
For short site name, hence short email display name, such as "Drépal" or "Fátima Maria", the patch correctly generates the following From header:
which gets rendered to:
But once we have something long, the
=?UTF-8?B?
thingy gets split into multiple lines, and then it goes terribly wrong because the domain name gets attached to every single split line:And we would even end up with strange email addresses like
<bmaster>@example.com>
instead of<webmaster@example.com>
; it just randomly depends on where the lines got split.I don't know which part of the Drupal or PHP code indiscriminately adds those domain names at the wrong places, but that needs to be dealt with too.
So, please do some more investigation, preferably with more test cases, especially with longer site names (e.g., a string with about 30 CJK characters would trigger this), so that no regression is introduced with corner cases.
Comment #50
cilefen CreditAttribution: cilefen as a volunteer commented#49 indicates it needs more test coverage.
Comment #51
Liam MorlandThanks. Does anyone know what code causes it to be split up?
Comment #52
Liam MorlandThose are likely added by the MTA.
Comment #53
das.gautam CreditAttribution: das.gautam as a volunteer commentedWe have been facing the same issue as mentioned in #49. Bigger site names get's split's into multiple lines.
I guess the following function is responsible for splitting.
This function can found at "/core/lib/Drupal/Component/Utility/Unicode.php"
Comment #54
Liam MorlandThanks, @das.gautam27.
Does anyone know why Unicode::mimeHeaderEncode() breaks up the string into 47 character chunks? I am about to write a version of this patch which doesn't do this, but perhaps there is a reason why this is done and the patch needs to do something else.Answer: It is a requirement of RFC 2047.
Comment #55
Liam MorlandRFC 2047 requires that the encoded-word (the
=?UTF-8?B?...?=
thing) be no longer than 75 characters in total, including the=?UTF-8?B?
and?=
. Unicode::mimeHeaderEncode() enforces this length, breaking things up as needed.What I do not know is what a From header is supposed to look like if display-name is long enough that it needs an encoded-word that is longer than 75 characters. It may be that it should do what it does with the current patch, but later on there is a bug which adds the domains, as observed in #49.
One possible partial solution: If the site name is long enough that more than one encoded-word is returned by Unicode::mimeHeaderEncode(), drop all but the first of these. This completely solves it for shorter site names and at least makes a valid header for longer ones. Only site names that require encoding and are too long would be truncated. Thoughts on this?
Comment #56
skylord CreditAttribution: skylord commentedNote, that according to mentioned RFC 2047:
So CRLF SPACE charchters don't need to be encoded and I fixed that bug for my projects (with very long russian words in email headers) with simple patch attached.
Comment #57
Liam MorlandWhat it is saying is that CRLF SPACE is needed to separate each encoded-word. Those characters still need to be encoded.
Comment #58
skylord CreditAttribution: skylord commentedNo, they don't, actually. Look at RFC 2822:
So, CRLF is OK in header field body.
Comment #60
alex73 CreditAttribution: alex73 commentedGuys, it's crazy situation.
Drupal is unusable since May 2016, more than year for today, because users can't receive emails at all, i.e. they can't register.
There is patch:
- $headers['From'] = $site_config->get('name') . ' <' . $site_mail . '>';
+ $headers['From'] = Unicode::mimeHeaderEncode($site_config->get('name')) . ' <' . $site_mail . '>';
This patch works good. But patch still is not in the code. What's happen ?
Comment #61
oriol_e9g@alex73 happens that the problem it's not correctly solved when you use long names (more than 78 characters). See comments from #49 to #58, we need to add tests and fix for large names.
Comment #62
Liam MorlandThis patch is like #42 except that for long names, only the first encoded-word is included. The effect is that this bug is completely solved for everything except long names that include special characters. In these cases, the site name will be cut off. This patch fixes the problem for most people without making it worse for anyone.
Comment #63
anna.radulovski CreditAttribution: anna.radulovski commentedWe are working on triaging this issue with Merel van Empel @emmezali
Comment #64
colorfulCoder CreditAttribution: colorfulCoder commentedJust commenting for notices. Whazzup!
Comment #65
colorfulCoder CreditAttribution: colorfulCoder commentedWhen triaging this issue we got stuck trying to reproduce the bug.
Steps we did to try and reproduce the bug:
- Install Drupal 8.5.x locally using Acquia Desktop
- Create a new account with actual real Gmail address via /admin/people/create (ticked the box "Notify user of new account")
No e-mail received. Our hypothesis is that that is because it's locally, not online.
So:
Next step was to use simplytest.me to create a Drupal8.5.x site and repeat the same steps (Create account + tick box)
No e-mail received.
What is needed is a guide to set up a drupal site that will be able to send e-mail to real Gmail adresses.
An educated guess on how to do this is to set up a mail server, and put the drupal installation on that same server.
We will not be doing that today, because this was only triage.
Comment #66
valthebaldTo track down/test the patch, you need some mail catching tool (check https://www.drupal.org/docs/develop/local-server-setup/managing-mail-han... for the options)
Comment #67
Liam MorlandYou could also add some debugging output to MailManager::doMail() and observe that the headers are not encoded properly.
Comment #68
gaboo CreditAttribution: gaboo commentedGetting the issue here, too. Lost a lot of time debugging this.
The easy fix for me was removing non ascii characters from the site name.
But a proper fix would be welcome ....
Comment #69
Liam MorlandPlease try the patch in #62.
Comment #70
alexpottLooking at another PHP mailer implementation https://github.com/swiftmailer/swiftmailer/blob/8388605460335d4c21481eb3... - shortening is exactly what that does. I like their variable name
$shorten
since I have to think less about what is actually happening.We need a test for a long site name.
Comment #71
alexpottAh so there is a test for long site names. I think we should do something like:
In the test because it makes it clearer that the truncation is successful. However we have another problem. Looking at SwiftMail when we shorten we need also to allow for the fact that
From:
takes some space on the line. As the RFC states:SwiftMail does this but we don't :( - in fact this is a bug in our implementation - so not something that needs to be addressed here. We should file a followup to say our PHPMail implement does not obey the RFC.
Comment #72
alexpottHere's a patch that does the suggestions in #71 - there's already an issue for the header length problem - #2407117: Unicode::mimeHeaderEncode() ignores header field name
Comment #73
Liam MorlandI think $first_chunk_only is more clear than $shorten, but I'm good with it either way. The additional test is good. What is the difference between assertEqual() and assertEquals()?
Comment #74
borisson_@Liam Morland,
assertEquals
is the phpunit unit way of doing this,assertEqual
is the older simpletest way (that still works because of a legacy trait.New code should use the
assertEquals
call.Patch in #72 looks very good, there's no bc break because the new functionality is behind a flag that defaults to FALSE. There's also sufficient new testcoverage and the existing tests don't break.
I also prefer
$shorten
instead of$first_chunk_only
- it's clearer to me what it's doing.Comment #75
larowlanPatch no longer applies sorry
Comment #76
yogeshmpawarComment #77
yogeshmpawarRerolled the patch #72 because it is failed to apply on 8.5.x branch.
Comment #78
alexpottThis comment is not quite right. The headers in vanilla Drupal are encoded by \Drupal\Core\Mail\Plugin\Mail\PhpMail::mail().
Comment #79
alexpottPatch to address #78.
Comment #80
jofitz CreditAttribution: jofitz at ComputerMinds commentedRemove tag.
Comment #81
Liam MorlandLooks good.
Comment #83
alexpotthttps://www.drupal.org/pift-ci-job/814524 failed for unrelated reasons.
Comment #85
alexpottThis issue is unlucky... another random fail due to the testbot being super slow.
Comment #86
larowlanAdding credit for reviewers
Comment #89
larowlanCommitted as 66fc180 and pushed to 8.5.x.
Cherry-picked as 9702106 and pushed to 8.4.x
Thanks all
Comment #90
Liam MorlandThank you!
Comment #92
cilefen CreditAttribution: cilefen as a volunteer commentedComment #93
allella CreditAttribution: allella as a volunteer commentedComment #94
cilefen CreditAttribution: cilefen as a volunteer commented