Problem description

For each mail sent, this module needs to compare the requested format (plain text/HTML) with the current format and maybe perform some conversion. Currently there are several bugs with this.

Part of the cause of these bugs is confusion over how to know the current format. I propose that the only reliable way is the standard "safe-markup" mechanism used throughout Drupal: if the content implements MarkupInterface then it is safe markup; if not then it must be escaped if used in HTML context.

We can look at each of the 4 possible cases in turn:

1. Input plain, output plain
- Correct: do nothing
- Actual: if the content contains characters that resemble HTML tags it will be treated as HTML and corrupted

2. Input plain, output HTML
- Correct: HTML encode (and possibly a little more filtering)
- Actual: since #2937293: Plain text html escape characters Problem code will call MailFormatHelper::htmlToText() instead.

3. Input HTML, output plain
- Correct: convert HTML to text
- Actual: no conversion

4. Input HTML, output HTML
- Correct: do nothing
- Actual: if the input contains no tags it will be HTML escaped leading to double-encoding of entities

1. and 4. were introduced by #2831959: HTML in Drupal core emails. 3. Has been present for a long time as far as I can see.

These are objective clear bugs and we can fix all of them without any disadvantages that I can see.

Secondary problems

In contrast these are more subjective points, and fixing them would perhaps have downsides for some users, or would be non-back-compatible. Now split into separate issues.

  1. #3125041: Clarify and simplify message settings
  2. #3125042: [META] New & improved format conversions

Proposed resolution

In the first case let's set aside the secondary problems. Therefore correct code for SwiftMailer::massageMessageBody() is

  1. If input is markup and output is plain, then call MailFormatHelper::htmlToText()
  2. If input is plain and output is HTML, then apply the plain text filter

We should write some better tests, but that's not possible until fixing #3106374: Count error if translation is missing.

Comments

AdamPS created an issue. See original summary.

adamps’s picture

Issue summary: View changes
berdir’s picture

You need to allow *some* HTML, for example links. Otherwise it's not possible to convert things like register/password covery mails with links in them to somewhat decent loooking e-mails.

plain text isn't the only safe format, you can't inject javascript into e.g. basic_html.

geek-merlin’s picture

@AdamPS Thanks for elaborating this. For me this is a wontfix in swiftmailer.
Imho applications are liable for passing filtered text to swiftmailer.
The standard case of which is using the formatter text datatype, which handles all permission and access issues.
If people want to have formatted user mails, provide an implementation for this (i'd recommend easy_email).
So no issue for swiftmailer. (Or am i missing something?)

adamps’s picture

Issue summary: View changes

#3 thanks @Berdir

You need to allow *some* HTML, for example links.

Sure - I agree that it's good to call a text formatter and it can contain the filter "Convert URLs into links" which I think plain_text does by default.

My objections are related to the details of how we do it.

plain text isn't the only safe format, you can't inject javascript into e.g. basic_html.

I agree that a default D8 clean install of a certain profile creates a text format called basic_html that is safe. In other situations is might be completely different. My point is that Drupal states that we need to check access before using a text format but we aren't.

adamps’s picture

#4

So no issue for swiftmailer. (Or am i missing something?)

Oops I didn't explain well then. Unfortunately yes I have not enabled you to understand my point so I have tried to improve the IS. My point is that in SwiftMailer::massageMessageBody() this module is making its own call to a formatter and I believe it's not doing it correctly.

The standard case of which is using the formatter text datatype, which handles all permission and access issues.

Agree

If people want to have formatted user mails, provide an implementation for this (i'd recommend easy_email).

Yes fine.

adamps’s picture

Issue summary: View changes

#3 basic_html might be safe but it would likely be a pretty bad idea. It would mean that the contact form would effectively become HTML but the person typing a message wouldn't realise that for example they needed to type & instead of &.

berdir’s picture

Just some quick note, will try to reply more later. I agree that we need to be very careful here, but I'd advise against doing too much, too quickly as it might break a lot of sites.

For example, convert URL's doesn't help if you want to do something like click here, it only converts raw URL's to links.

adamps’s picture

Issue summary: View changes

#8 Sure we definitely don't want to break sites.

For example, convert URL's doesn't help if you want to do something like click here, it only converts raw URL's to links.

Well if you have an actual link then your content is already HTML so it doesn't need to be converted. Provided that your $body implements MarkupInterface then it will work fine. If not then the HTML will get escaped. This aspect seems out-of-scope for this issue because it worked fine before and after #2831959: HTML in Drupal core emails and I'm not proposing to alter it.

berdir’s picture

I don't think that's out of scope, that's exactly what this setting allows you to do. Use HTML for configurable mails like user register e-mails. Maybe there could be a configuration of safe modules/mail keys or something.

adamps’s picture

Issue summary: View changes

I have improved the IS again.

Here is the commit from #2831959: HTML in Drupal core emails. It adds some code relating to the scenario that the target format is HTML and the content contains no tags.

On the other hand #8 as far as I understand is discussing the case if you put HTML links in the user register email so the content will contain tags. #2831959 didn't change that case.

However I agree that this is a desirable scenario for to cover. If $body implements MarkupInterface the HTML will remain in (however I suspect that's not true for user register mails). If not the HTML will be converted to text which is pretty strange if the target format is HTML.

So I think what you are asking for is a good idea, it's currently broken and this issue can help fix it.

adamps’s picture

Issue summary: View changes
geek-merlin’s picture

I still strongly think that we do not want to cover text format handling. Let me elaborate it more and socratically.

Mrs. Swift: Our module does one thing, sending emails and it does it well. The contract is that you pass in text in the right format, and we focus on sending that. If we get plain text, we send it as such. If we get markup, we send it as such.

Mr. Wantfeature: But when i have a core user mail sent, it does not look good and i want you to add logic for that.

Mrs. Swift: Again, that's not our business and #2831959: HTML in Drupal core emails was a step in the wrong direction.
We want swiftmailer to be maintainable and focus on its core functionality, not privide a hacky way to fix a core legacy deficit.
When such text format mapping is wanted, you can do it in custom code or start a separate contrib module for that.

But you don't even have to. EasyEmail contains easy_email_override that lets you configure user emails with text format.
It also lets you enable formatted and tokenizable templates for any mail with ~5 LOC.

Now that there is at least on other module that provides the formatted-text bridge, there is no reason left for swiftmailer to go into that direction. But even if that were not the case (or you don't like that module which is perfectly fine), it is proof that that text format handling can be done in contrib thus should not be swiftmailer's business.

adamps’s picture

Title: Ensure safe and correct use of text formats » Remove conversions that are unreliable/unclear/arbitrary

@geek-merlin very funny and very true.

We are starting a new 2.x branch and it's the ideal time to remove some legacy dead wood from the module. There is currently some black magic with little explanation in the UI or code. There will likely be some resistance but in the end hopefully people will agree that the overall system is clearer, more reliable and easier to use.

People might say to us "who are these crazy new guys to force users of this module in a certain direction". Well firstly we are offering to maintain it and no one else was. Secondly to those people, please comment on this issue now - we are very open to listen. Or if you don't see this issue in time, you can raise issues later and we can make it better.

I don't think we necessarily have to be entirely against conversions. For example, it could be OK for this module to have a setting "convert all mails to HTML". In this case we would take mails that are plain text plus are labelled as plain text and then convert them to mails that are HTML plus labelled as HTML.

adamps’s picture

Title: Remove conversions that are unreliable/unclear/arbitrary » Remove conversions that are unreliable/unclear/compensatory
Issue summary: View changes
adamps’s picture

Issue summary: View changes
Related issues: +#2893606: Generating HTML emails

Added related issue #2893606: Generating HTML emails where people are getting caught out by exactly this code.

adamps’s picture

Title: Remove conversions that are unreliable/unclear/compensatory » Remove conversions/corrections that can corrupt data
Issue summary: View changes

Updated the IS to clarify what I think we are trying to do here.

adamps’s picture

Title: Remove conversions/corrections that can corrupt data » Content type conversion is inaccurate, insecure, inflexible
Issue summary: View changes

OK, I have checked in more detail. Almost all of the problems come from when "Respect provided e-mail format" is set to FALSE. So I have made this issue specifically about that and will create a separate one for the other problem.

Given this is an optional setting then the situation is more easily managed. We can add a warning about the problems and if people choose to continue using it they accept the downsides.

See updated IS.

adamps’s picture

Issue summary: View changes
StatusFileSize
new6.04 KB

Here is a patch that hopefully finds a balance between back-compatibility and correctness. It feels that we must warn about the downsides and ensure that there is a way for sites to avoid them.

adamps’s picture

adamps’s picture

Title: Content type conversion is inaccurate, insecure, inflexible » Format selection/conversion is inaccurate/insecure
Issue summary: View changes

OK, finally now I think I get it. Come on everyone please tell me if I am on track. I have spent quite a few hours studying carefully but I wasn't involved in all the history so please correct me if needs be.

Especially @berdir thank you for the earlier comments and if you are still following please tell me what you think now.

I will post again in the issues that I propose to back out to give people to chance to object.

adamps’s picture

Issue summary: View changes
Related issues: +#2831959: HTML in Drupal core emails
adamps’s picture

Issue summary: View changes
heddn’s picture

Given the config option was hidden (it is still hidden, right?) and only changeable by someone who knows what they want to do, I'm not sure this is as serious of an issue as suggested. The module defaults to the most secure option, which is to plain text mails. If someone knows what they want to do and are OK using Basic HTML, that's what the hidden option allows developers do.

If the problem is how we determine to apply formatters, then I think if I understand what is proposed, we'd prefer to use a class check instead of scanning the string value. In theory, I'm not opposed to this proposal. But we'd want some form of testing showing how this would cover all cases. Because I'm not sure it does. I think that non MarkupInterface strings with html in them can make it to swiftmailer.

hansfn’s picture

I think that non MarkupInterface strings with html in them can make it to swiftmailer.

Yes, I was thinking about the same. I haven't tested how this change, using a class check, will affect modules like Views Send, but it seems like a breaking change - which I guess is fine; 2.x is still alpha.

adamps’s picture

Thanks both for some feedback. It has been interesting trying to figure this one out.

Regarding security yes you are right and I have mostly stopped worrying it. It's not mentioned in the IS any more.

Regarding "non MarkupInterface strings with html" I believe the current code will get HTML-escaped them (in the plain_text format) so they are already not working.

Regarding using "Basic HTML" yes I thought that too for a while and other people on other issues have suggested it. But I now think that's a wrong choice. The chosen format will be called with plain text that needs to be encoded and it's only called if there are no tags. Hence my point that we need to document things:-).

Testing for markup using strip_tags() is unreliable and it can corrupt messages. Any plain-text message that contains $lt; followed by a letter will be counted as markup and therefore corrupted. Any markup without a tag won't count, even if it contains & so any entities will be double-encoded. There's not really any way a site can prevent the corruption as it's built into swiftmailer. To me this doesn't seem good enough and that's what I propose we need to fix.

Starting 2.x seems like a chance to take a new course and get things right. Ideally any corrective processing to be at the application level - i.e. specific to the module/key. Module A always needs us to apply the default text format, so we apply it specifically there. However I'm open to persuasion on that count. In the IS search for "pragmatic reasons" where I have already mentioned that as a valid compromise.

adamps’s picture

Regarding "non MarkupInterface strings with html" I believe the current code will get HTML-escaped them (in the plain_text format) so they are already not working.

Sorry not in the plain_text format. It looks like they will be passed to MailFormatHelper::htmlToText($body) in dev or to Html::escape() in the beta. Either way they won't stay as markup.

There will be many normal cases where this module is passed "non MarkupInterface strings containing legitimate plain text that happens to contain a < . Surely we have to run HTML escaping in this case?

geek-merlin’s picture

@heddn #25: Implement the most secure option, +1.

geek-merlin’s picture

Status: Active » Needs review
StatusFileSize
new7.61 KB

Patch flying in to see how this looks.

adamps’s picture

Title: Format selection/conversion is inaccurate/insecure » Bugs with format conversion
Issue summary: View changes
Status: Needs review » Needs work

OK I have a plan. Let's fix the clear bugs first. Then we can consider the more subjective, non-BC changes later. I updated the IS for that.

adamps’s picture

Status: Needs work » Needs review
StatusFileSize
new2.48 KB

Status: Needs review » Needs work

The last submitted patch, 32: swiftmailer.respect_format.3122389-32.patch, failed testing. View results
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

adamps’s picture

Status: Needs work » Needs review
StatusFileSize
new3.36 KB
new1.18 KB

Status: Needs review » Needs work

The last submitted patch, 34: swiftmailer.respect_format.3122389-34.patch, failed testing. View results
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

adamps’s picture

Status: Needs work » Needs review
StatusFileSize
new2.95 KB
new851 bytes

Status: Needs review » Needs work

The last submitted patch, 36: swiftmailer.respect_format.3122389-36.patch, failed testing. View results
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

berdir’s picture

can't keep up here, I'd appreciate not getting this in too quickly so I (and others I suppose) have a chance to catch up on all the discussion and changes. The issue title changes alone have my head spinning ;)

adamps’s picture

Issue summary: View changes
Status: Needs work » Needs review
StatusFileSize
new4.13 KB
new1.35 KB

@Berdir sure we can give plenty of time for review - in the meantime there are other issues to fix.

It took me a while to figure things out, but hopefully now I have and no more changing things. There is no need to read all my past comments. The IS is up-to-date and hopefully quite clear and concise. The proposed new code is hopefully simple and logical.

adamps’s picture

Issue summary: View changes
adamps’s picture

Priority: Major » Normal
Issue summary: View changes

I have now split the "secondary problems" into two separate issues - #3125041: Clarify and simplify message settings and #3125042: [META] New & improved format conversions. It is those that are potentially controversial and we should not rush in to. We need to give everyone a chance to understand and to comment.

PLEASE FOLLOW AND COMMENT ON THOSE ISSUES

With those parts removed this issue is a simple bug fix. I have reduced the priority to normal. There are some cases where conversions are done incorrectly - mostly HTML messages that look like plain text and vice versa. This issue is fully BC and it you look at should be entirely uncontroversial: look at the changes to tests (patch coming in next comment) and hopefully it's entirely obvious that the new version is better than the old.

adamps’s picture

adamps’s picture

OK, it would be great if we could review then commit this one as a priority please. Once we have done that I can make another alpha release that I believe is good enough to use on live sites and I will start testing with it. (It's an alpha release not an RC because we haven't finished making non-BC changes yet).

As per previous comment this issue is now fully BC and just fixes some bugs.

Please ignore the interdiff - it got broken because of another large commit in between. The new patch just adds the test changes.

geek-merlin’s picture

Status: Needs review » Reviewed & tested by the community

Code looks good to me, nice cleanup job!
Let's get this in and weed out the other stuff after that. +1 for splitting that out.

  • AdamPS committed 7f405fc on 8.x-2.x
    Issue #3122389 by AdamPS, geek-merlin, Berdir, hansfn, heddn: Bugs with...
adamps’s picture

Status: Reviewed & tested by the community » Fixed

Great thanks.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.