Problem/Motivation

In \Drupal\Core\Mail\Plugin\Mail\PhpMail::format() there is a conversion from HTML to text which occurs without checking if the input is an instance of MarkupInterface. This causes corruption of a mail that was in fact plain text and contains text that resembles HTML.

E.g.

Today I learnt Greek letters alphaβ they are hard to write.
In HTML, ampersand must be written as &.
I saw your house and <wow> it is great.
If a<b and b<c then a<c.

Steps to reproduce

Install a contrib module that allows sending of messages in plain text format. An example is simplenews. Type a message like the example above.

Proposed resolution

Only call htmlToText() if the input implements MarkupInterface.

Remaining tasks

None.

User interface changes

None.

API changes

None.

Data model changes

None.

Release notes snippet

None.

Comments

AdamPS created an issue. See original summary.

mxr576’s picture

Hm, I have uploaded a patch to the referenced https://www.drupal.org/project/drupal/issues/2223967 which I believe does exactly the same that you describe but I got a push back on it.

https://www.drupal.org/project/drupal/issues/2223967#comment-13602092

These are exactly the same patches.

adamps’s picture

Thanks @mxr576 however that patch is not the same as this issue.

The patch alters MessageViewBuilder.php in the contact module. The issue summary proposes to alter PhpMail::format().

NitinLama’s picture

Assigned: Unassigned » NitinLama
Status: Active » Needs review
StatusFileSize
new907 bytes

Status: Needs review » Needs work

The last submitted patch, 4: 3174760-4.patch, failed testing. View results
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

NitinLama’s picture

Status: Needs work » Needs review
StatusFileSize
new1.51 KB

Status: Needs review » Needs work

The last submitted patch, 6: Mail_HTML_entities-3174760-6.patch, failed testing. View results
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

snehalgaikwad’s picture

Assigned: NitinLama » Unassigned
Status: Needs work » Needs review
StatusFileSize
new1.51 KB
snehalgaikwad’s picture

StatusFileSize
new565 bytes
adamps’s picture

Status: Needs review » Needs work

Thanks. However it's more complex than that. The message body is an array, and each entry in the array is either MarkupInterface or a string. So something like this (not tested):

foreach ($message['body'] as &$part) {
  if ($part instanceof MarkupInterface) {
    $part = MailFormatHelper::htmlToText($part);
  }
}
kishor_kolekar’s picture

Status: Needs work » Needs review
StatusFileSize
new1.65 KB
new1.14 KB

please review the patch.

Status: Needs review » Needs work

The last submitted patch, 11: 3174760-11.patch, failed testing. View results
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

kishor_kolekar’s picture

Status: Needs work » Needs review
StatusFileSize
new1.74 KB
new480 bytes
adamps’s picture

Status: Needs review » Needs work
Issue tags: +Needs tests

Thanks. Please see MailInterface::format() for a very clear description of what this patch needs to do.

The conversion process consists of the following steps:
   * - If the output is HTML then convert any input line that is a string using
   *   \Drupal\Component\Utility\Html\Html::Escape().
   * - If the output is plain text then convert any input line that is markup
   *   using \Drupal\Core\Mail\MailFormatHelper::htmlToText().
   * - Join the input lines into a single string.
   * - Wrap long lines using \Drupal\Core\Mail\MailFormatHelper::wrapMail().
if(is_array($message['body'])) {

The loop must go before the call to implode. Then you don't need this if line.

This needs a test before it can be committed.

snehalgaikwad’s picture

StatusFileSize
new1.98 KB
new1.29 KB

I have tried to change the process, converted the message first and then joined lines into single string. Please check and let me know if this way is correct. Also test cases are still need to add. If the approach is correct then we can move on to tests.Thank you.

adamps’s picture

Thanks it is getting better each time. Please delete this line and the else branch.
Html::Escape($part);
The comment says to do this if the output is HTML. However PHPMail always generates plain text output.

snehalgaikwad’s picture

StatusFileSize
new1.91 KB
new481 bytes

Thank you for suggestions! Updated patch as per #16.

kapilv’s picture

Status: Needs work » Needs review
adamps’s picture

Status: Needs review » Needs work

Code looks good. I think you don't need this line now.
+use Drupal\Component\Utility\Html;

Next step is to write tests - back to "Needs work" for that.

kapilv’s picture

Status: Needs work » Needs review
StatusFileSize
new638 bytes

Removed "use Drupal\Component\Utility\Html" Class.

kapilv’s picture

StatusFileSize
new1.14 KB

Removed "use Drupal\Component\Utility\Html" Class.

Status: Needs review » Needs work

The last submitted patch, 21: 3174760-21.patch, failed testing. View results
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

Version: 9.1.x-dev » 9.2.x-dev

Drupal 9.1.0-alpha1 will be released the week of October 19, 2020, which means new developments and disruptive changes should now be targeted for the 9.2.x-dev branch. For more information see the Drupal 9 minor version schedule and the Allowed changes during the Drupal 9 release cycle.

NitinLama’s picture

Assigned: Unassigned » NitinLama
Status: Needs work » Needs review
StatusFileSize
new1.76 KB
new1.76 KB
NitinLama’s picture

Assigned: NitinLama » Unassigned

Updated patch as per #19 suggestion.

Version: 9.2.x-dev » 9.3.x-dev

Drupal 9.2.0-alpha1 will be released the week of May 3, 2021, which means new developments and disruptive changes should now be targeted for the 9.3.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

adamps’s picture

Issue summary: View changes
Status: Needs review » Needs work

The patch is good, but it needs a test. The test should fail with an "only tests" patch. The test can send the example text from the issue summary.

Add another case inside mail_html_test_mail(), something like this

function mail_html_test_mail($key, &$message, $params) {
  switch ($key) {
    case 'render_from_message_param':
      $message['body'][] = \Drupal::service('renderer')->renderPlain($params['message']);
      break;

    case 'plain_from_message_param':
      $message['body'][] = $params['message'];
      break;
  }
}

Add another test in MailTest.php like this

    $message = "Today I learnt Greek letters alpha&beta; they are hard to write.\nIn HTML, ampersand must be written as &amp;.\nI saw your house and <wow> it is great.\nIf a<b and b<c then a<c.";
    \Drupal::service('plugin.manager.mail')->mail('mail_html_test', 'plain_from_message_param', 'plain@example.com', $language_interface->getId(), ['message' => $message]);
vsujeetkumar’s picture

Status: Needs work » Needs review
StatusFileSize
new1.74 KB

Re-roll patch created for 9.3.x.

vsujeetkumar’s picture

StatusFileSize
new3.89 KB
new1.86 KB

Added tests, As advised in #27, Please have a look.

adamps’s picture

Title: Mails resembling HTML entities are corrupted » Mails resembling HTML are corrupted
Status: Needs review » Needs work
Issue tags: -Needs tests

Great thanks that's good progress. Here are my comments:

1. You added the new test to testRenderedElementsUseAbsolutePaths() but it's not really related. I suggest to make a new test, maybe like this:

public function testResembleHtml() {

2. In MailTest.php this code is not needed - we know that $message is a plain string.

+use Drupal\Component\Render\MarkupInterface;


+    // Convert into plain text and assert.
+    if ($message instanceof MarkupInterface) {
+      $message = MailFormatHelper::htmlToText($message);
+    }

3. I see that there is the same bug with the mail subject - please can we fix that here too? The fix is in MailManager::doMail(). Instead of this code

        if ($message['subject']) {
          $message['subject'] = PlainTextOutput::renderFromHtml($message['subject']);
        }

Do this:

        if ($message['subject'] && ($message['subject'] instanceof MarkupInterface)) {
          $message['subject'] = PlainTextOutput::renderFromHtml($message['subject']);
        }

Then add a test for the subject also. You can take one of the four sentences out of $message and put it in the subject instead.

adamps’s picture

ankithashetty’s picture

Status: Needs work » Needs review
StatusFileSize
new4.61 KB
new3.5 KB

Updated patch in #29 addressing all the points made in #30, kindly review.

Thanks!

Status: Needs review » Needs work

The last submitted patch, 32: 3174760-32.patch, failed testing. View results

vsujeetkumar’s picture

Status: Needs work » Needs review
StatusFileSize
new5.56 KB
new1.34 KB

Fixed fail tests, Please have a look.

adamps’s picture

Status: Needs review » Needs work

Thanks to you both.

1.

--- a/core/modules/contact/tests/src/Functional/ContactPersonalTest.php
+++ b/core/modules/contact/tests/src/Functional/ContactPersonalTest.php
@@ -99,6 +99,7 @@ public function testSendPersonalContactMessage() {
+    $mail['subject'] = PlainTextOutput::renderFromHtml($mail['subject']);

No I don't think we can do that - the test is correct, so it means that the code is now broken. The problem is that the contact module is passing markup in something that isn't MarkupInterface. It looks like the problem is in contact_mail(): the concatenation operator (.=) forces the result back to a string.

In contact.module line 142 instead of this line:

      $message['subject'] .= t('[@form] @subject', $variables, $options);

do this:

      $message['subject'] = t('[@form] @subject', $variables, $options);

2. In testResembleHtml(), maybe better to take the sentence that's now in $subject out of $message so it's like this:

$subject = "Today I learnt Greek letters alpha&beta; they are hard to write.";
$message = "In HTML, ampersand must be written as &amp;.\nI saw your house and <wow> it is great.\nIf a<b and b<c then a<c.";
vsujeetkumar’s picture

Status: Needs work » Needs review
StatusFileSize
new5.54 KB
new2.6 KB

Changes has been done according to #35, Please have a look.

@AdamPS In contact.module I have done the changes on line# 161 in place of 142, Now the tests is pass.

adamps’s picture

Status: Needs review » Needs work

Thanks, looks good. In contact.module I think we need to change all three cases: lines 142, 151 and 161. You are right, only one is needed to pass the automatic tests, but they all have the same bug.

Then the next step is to upload an "Only tests" patch: it has the changes to tests and nothing else. The automatic tests should fail for this one of course.

vsujeetkumar’s picture

Status: Needs work » Needs review
StatusFileSize
new6.38 KB
new921 bytes
new1.31 KB

Changes has been done according to #37. Test only patch also added, Please have a look.

berdir’s picture

+++ b/core/lib/Drupal/Core/Mail/MailManager.php
@@ -305,7 +305,7 @@ public function doMail($module, $key, $to, $langcode, $params = [], $reply = NUL
         // subjects are plain strings.
-        if ($message['subject']) {
+        if ($message['subject'] && ($message['subject'] instanceof MarkupInterface)) {
           $message['subject'] = PlainTextOutput::renderFromHtml($message['subject']);
         }

+++ b/core/modules/contact/contact.module
+++ b/core/modules/contact/contact.module
@@ -139,7 +139,7 @@ function contact_mail($key, &$message, $params) {

I don't think the subject can ever be HTML? Why would we bother to check for that or expect it to be Markup? At best that seems like a side-effect of using t().

This was added in #2572597: Replace !placeholder with @placeholder in mail code

berdir’s picture

I think the change does make sense, but it's also an edge case between a bugfix and a change. People that are sending plain text mails might currently not pass in a proper Markup object and this change means that suddenly, their HTML will be full of HTML tags.

We did already several steps towards encouraging Markup objects, but so far, it was only to be able to actually send HTML mails.

I expect this is going to break mails for quite some custom code out there :-/.

berdir’s picture

Thinking out loud, could we turn it around and instead somehow flag message bodies to explicitly be plain text?

Status: Needs review » Needs work

The last submitted patch, 38: 3174760-38-test-only.patch, failed testing. View results

adamps’s picture

Issue summary: View changes
Status: Needs work » Reviewed & tested by the community
Issue tags: +Needs subsystem maintainer review, +Needs framework manager review

Thanks @Berdir

I don't think the subject can ever be HTML? Why would we bother to check for that or expect it to be Markup?

Importantly, there is all the code that realised it needed to pass in Markup to avoid corruption:-) For example see simplenews_token_replace_subject(). Without that code, this patch would be definitely non-BC, and almost impossible for contrib modules to write code that works accurately with multiple core versions. A second category is anyone who calls Token::replace() which outputs HTML (although annoyingly not as a MarkupInterface, which is a separate bug, #2580723: Fix token system confusion, with new function Token::replacePlain()).

People that are sending plain text mails might currently not pass in a proper Markup object and this change means that suddenly, their HTML will be full of HTML tags.

Could be, but the interface documentation is strongly in favour of this patch:

hook_mail() body:
The array may contain either strings or objects implementing \Drupal\Component\Render\MarkupInterface.

MailInterface::format():
The message body is received as an array of lines that are either strings or objects implementing \Drupal\Component\Render MarkupInterface

Furthermore, the popular swiftmailer module contains code exactly like this patch without any obvious sign of complaint. So perhaps there won't be as much broken contrib code as you fear??

Finally, I don't see how we can ever fix this bug without getting the people who are passing markup but not as MarkupInterface to change their code.

Thinking out loud, could we turn it around and instead somehow flag message bodies to explicitly be plain text?

Maybe, but it seems pretty tricky/messy. The body is an array and, at least in theory, each part could separately be markup or not. I guess the flag could be "please trust me and only call htmlToText() if I pass MarkupInterface". Then maybe we deprecate not passing the flag to give people a warning. And in D10 we deprecate the flag and everyone has to change their code again??

I propose that we should try to get comments from more key people. How about if we set this to RTBC with some appropriate issue tags and see what people say?

adamps’s picture

Issue tags: +Needs change record

I am struggling to imagine plasible contrib code that will be hit by this issue - can anyone give an example?

To hit the bug you need a string containing markup but not implementing MarkupInterface. Generally core would not generate such a thing so the contrib code would have to make it - which suggest that the coder has some awareness that they are making such a thing. Then to see the bug, pass this string as the body of a mail message - expecting the markup to be converted to plain.

Why would anyone create such a string containing markup only to pass it to a function that they want to remove the markup??

adamps’s picture

adamps’s picture

berdir’s picture

Status: Reviewed & tested by the community » Needs work

> Could be, but the interface documentation is strongly in favour of this patch:

IMHO, that documentation just documents that it is allowed to use strings or Markup objects, not what that specifically means. We should probably extend that documentation at least and make it clear what will happen, although that's then in the end at least partially up to the implementation.

> Why would anyone create such a string containing markup only to pass it to a function that they want to remove the markup??

I guess I'm less talking about contrib but custom code. What I imagine is that some code renders something or does token replacement or whatever that results in some HTML and currently, that HTML would be removed and converted to a plaintext version if passed in as a string. This is much less of an issue with swiftmailer, because if people are using Swiftmailer then they likely want to send HTML mails anyway. It might be somewhat far-fetched, but it _is_ a behavior change.

I have pondered about it a bit whether I should leave it at RTBC or not, but going to set back to needs work to update the documentation and have a change record. The mail API documentation is notoriously bad about what it expects and supports exactly, we both have been plagued by that. Lets make this a small step toward improving it :) Beside that, I agree with your suggestion, to set it to RTBC and get feedback from a framework maintainer. I don't think the mail component/subsystem has a dedicated maintainer though.

Btw, your thoughts would be welcome in #1803948: [META] Adopt the symfony mailer component in case you're not already following that. On how to approach moving swiftmailer/symfony mailer component into core with the long-term goal of replacing our current awkward hook based mailing system.

adamps’s picture

Thanks Berdir

MHO, that documentation just documents that it is allowed to use strings or Markup objects, not what that specifically means.

OK here's a more specific statement, from the comment on MailInterface::format(). Does this one seem any more convincing?

The conversion process consists of the following steps:
- If the output is HTML then convert any input line that is a string using \Drupal\Component\Utility\Html\Html::Escape().
- If the output is plain text then convert any input line that is markup using \Drupal\Core\Mail\MailFormatHelper::htmlToText().
- Join the input lines into a single string.
- Wrap long lines using \Drupal\Core\Mail\MailFormatHelper::wrapMail().

This is much less of an issue with swiftmailer, because if people are using Swiftmailer then they likely want to send HTML mails anyway.

I believe the same problem exists if using swiftmailer. If some code creates markup in a plain string and passes it to swiftmailer then their mail will be corrupted. If it is an HTML mail, then attributes will be double-escaped. If it is a plain text mail, then markup will appear in the mail.

but going to set back to needs work to update the documentation and have a change record

Makes sense, will do.

I guess if #3165762: Add symfony/mailer into core seems like the direction and the key people are not comfortable with this patch then we can just ignore the bug in this issue and ensure that the replacement system doesn't have the same bug.

adamps’s picture

Status: Needs work » Reviewed & tested by the community
Issue tags: -Needs change record

Change record done - back to RTBC as per #47.

The last submitted patch, 38: 3174760-38.patch, failed testing. View results

Version: 9.3.x-dev » 9.4.x-dev

Drupal 9.3.0-rc1 was released on November 26, 2021, which means new developments and disruptive changes should now be targeted for the 9.4.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

The last submitted patch, 38: 3174760-38.patch, failed testing. View results

The last submitted patch, 38: 3174760-38.patch, failed testing. View results

The last submitted patch, 38: 3174760-38.patch, failed testing. View results

The last submitted patch, 38: 3174760-38.patch, failed testing. View results

The last submitted patch, 38: 3174760-38.patch, failed testing. View results

The last submitted patch, 38: 3174760-38.patch, failed testing. View results

The last submitted patch, 38: 3174760-38.patch, failed testing. View results

sanduhrs’s picture

There's no subsystem maintainer for mail listed in MAINTAINERS.txt.

larowlan’s picture

I'd like to see a reply here from @Berdir to indicate he agrees with the reply in #47

Version: 9.4.x-dev » 9.5.x-dev

Drupal 9.4.0-alpha1 was released on May 6, 2022, which means new developments and disruptive changes should now be targeted for the 9.5.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

The last submitted patch, 38: 3174760-38.patch, failed testing. View results

catch’s picture

Title: Mails resembling HTML are corrupted » [PP-1] Mails resembling HTML are corrupted
Status: Reviewed & tested by the community » Postponed
Related issues: +#3165762: Add symfony/mailer into core

I guess if #3165762: [PP-1] Add symfony/mailer into core seems like the direction and the key people are not comfortable with this patch then we can just ignore the bug in this issue and ensure that the replacement system doesn't have the same bug.

I kind of think we should do that, there's not really an easy way to communicate the change of behaviour here, whereas with a brand new API we could fix the bug as a side-effect and people will consciously need to switch over and (hopefully) check output then. Going to tentatively mark this postponed on that issue.

Version: 9.5.x-dev » 10.1.x-dev

Drupal 9.5.0-beta2 and Drupal 10.0.0-beta2 were released on September 29, 2022, which means new developments and disruptive changes should now be targeted for the 10.1.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 10.1.x-dev » 11.x-dev

Drupal core is moving towards using a “main” branch. As an interim step, a new 11.x branch has been opened, as Drupal.org infrastructure cannot currently fully support a branch named main. New developments and disruptive changes should now be targeted for the 11.x branch, which currently accepts only minor-version allowed changes. For more information, see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

kim.pepper’s picture

Status: Postponed » Active

#3165762: Add symfony/mailer into core is in so setting back to active. Might be closed now?

adamps’s picture

Status: Active » Fixed

I think yes, we should close it. The bug is fixed in the new Symfony Mailer @Mail plug-in. PhpMail still contains the bug, but I don't think it makes sense to fix that - it creates BC problems and the code will be removed when Symfony Mailer is adopted.

adamps’s picture

Title: [PP-1] Mails resembling HTML are corrupted » Mails resembling HTML are corrupted

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.