Problem/Motivation

In Inmail, we are dealing with mail message processing.
In order to decode email header fields, we use \Drupal\Component\Utility\Unicode::mimeHeaderDecode() helper.

Encoded header value is defined as:
encoded-field = "=?" charset "?" encoding "?" encoded-text "?="

The helper function assumes that encoding value is given as uppercase and fails to decode the given string when it's provided as lowercase.

RFC 2047 allows those values to be provided as either uppercased or lowercased:

Both 'encoding' and 'charset' names are case-independent. Thus the charset name "ISO-8859-1" is equivalent to "iso-8859-1", and the encoding named "Q" may be spelled either "Q" or "q".

Proposed resolution

Let \Drupal\Component\Utility\Unicode::mimeHeaderDecode() consider lowercased encoding values as well.
Provide test coverage for lowercased cases.

Remaining tasks

  1. Create patch with test
  2. Review
  3. Commit

User interface changes

API changes

Data model changes

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

mbovan created an issue. See original summary.

mbovan’s picture

Title: Unicode::mimeHeaderDecode() doesn't support lowercased charset » Unicode::mimeHeaderDecode() doesn't support lowercased encoding
Status: Active » Needs review
FileSize
1.43 KB
2.83 KB

Providing a fix and test-only patch.

I wanted to use the existing providerTestMimeHeader(), but couldn't get the excepted results as Unicode::mimeHeaderEncode() always returns uppercased charset/encoding values.

Also, it seems this affects all Drupal versions, starting from 4.5 as it was introduced in #32465: mime_decode for core.

The last submitted patch, 2: mime_header_decodelowercase-2822334-2-TEST-ONLY.patch, failed testing.

Version: 8.2.x-dev » 8.3.x-dev

Drupal 8.2.6 was released on February 1, 2017 and is the final full bugfix release for the Drupal 8.2.x series. Drupal 8.2.x will not receive any further development aside from critical and security fixes. Sites should prepare to update to 8.3.0 on April 5, 2017. (Drupal 8.3.0-alpha1 is available for testing.)

Bug reports should be targeted against the 8.3.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.4.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.3.x-dev » 8.4.x-dev

Drupal 8.3.6 was released on August 2, 2017 and is the final full bugfix release for the Drupal 8.3.x series. Drupal 8.3.x will not receive any further development aside from critical and security fixes. Sites should prepare to update to 8.4.0 on October 4, 2017. (Drupal 8.4.0-alpha1 is available for testing.)

Bug reports should be targeted against the 8.4.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.5.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.4.x-dev » 8.5.x-dev

Drupal 8.4.4 was released on January 3, 2018 and is the final full bugfix release for the Drupal 8.4.x series. Drupal 8.4.x will not receive any further development aside from critical and security fixes. Sites should prepare to update to 8.5.0 on March 7, 2018. (Drupal 8.5.0-alpha1 is available for testing.)

Bug reports should be targeted against the 8.5.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.6.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.5.x-dev » 8.6.x-dev

Drupal 8.5.6 was released on August 1, 2018 and is the final bugfix release for the Drupal 8.5.x series. Drupal 8.5.x will not receive any further development aside from security fixes. Sites should prepare to update to 8.6.0 on September 5, 2018. (Drupal 8.6.0-rc1 is available for testing.)

Bug reports should be targeted against the 8.6.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.7.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.6.x-dev » 8.8.x-dev

Drupal 8.6.x will not receive any further development aside from security fixes. Bug reports should be targeted against the 8.8.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.9.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

Version: 8.8.x-dev » 8.9.x-dev

Drupal 8.8.7 was released on June 3, 2020 and is the final full bugfix release for the Drupal 8.8.x series. Drupal 8.8.x will not receive any further development aside from security fixes. Sites should prepare to update to Drupal 8.9.0 or Drupal 9.0.0 for ongoing support.

Bug reports should be targeted against the 8.9.x-dev branch from now on, and new development or disruptive changes should be targeted against the 9.1.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

Kristen Pol’s picture

Version: 8.9.x-dev » 9.1.x-dev
Issue tags: +Bug Smash Initiative

Patch applies cleanly to 9.1.x.

jungle’s picture

Status: Needs review » Reviewed & tested by the community

The patch is vaild to 9.1.x, RED/GREEN as expected. Per RFC 2047, the changes are good. And the patch is good to go to me.

quietone’s picture

Status: Reviewed & tested by the community » Needs work

The IS looks fine and is up to date, yay! I didn't see any questions unanswered in the issue.

Looking at the patch , I found a few items.

  1. +++ b/core/tests/Drupal/Tests/Component/Utility/UnicodeTest.php
    @@ -75,6 +75,39 @@ public function testMimeHeader($value, $encoded) {
    +   * @see testMimeHeaderDecode()
    

    Is this necessary? I don't add an @see in the data providers that I have written.

  2. +++ b/core/tests/Drupal/Tests/Component/Utility/UnicodeTest.php
    @@ -75,6 +75,39 @@ public function testMimeHeader($value, $encoded) {
    +    return array(
    +      // Upper cased base64 encoding.
    

    Needs to use [] syntax. (Shows how old this patch is).

jungle’s picture

+++ b/core/tests/Drupal/Tests/Component/Utility/UnicodeTest.php
@@ -75,6 +75,39 @@ public function testMimeHeader($value, $encoded) {
+      // Upper cased base64 encoding.
+      array('tést.txt', '=?utf-8?B?dMOpc3QudHh0?='),
+      // Upper cased quoted-printable encoding.
+      array('tést.txt', '=?UTF-8?Q?t=C3=A9st.txt?='),
+      // Lower cased base64 encoding.
+      array('tést.txt', '=?utf-8?b?dMOpc3QudHh0?='),
+      // Lower cased quoted-printable encoding.
+      array('tést.txt', '=?UTF-8?q?t=C3=A9st.txt?='),
+      // Simple ASCII characters.
+      array('ASCII', 'ASCII'),

Thanks, @quietone for reviewing. Addressing, #12 meanwhile, transferred comments into keys.

The last submitted patch, 13: 2822334-13-test-only.patch, failed testing. View results

quietone’s picture

Status: Needs review » Needs work

Using the comments as key seems to be the trend, and it is helpful.

One more thing.

+++ b/core/tests/Drupal/Tests/Component/Utility/UnicodeTest.php
@@ -26,6 +26,32 @@ public function testMimeHeader($value, $encoded) {
+      'Upper cased base64 encoding' => ['tést.txt', '=?utf-8?B?dMOpc3QudHh0?='],
+      'Upper cased quoted-printable encoding' => ['tést.txt', '=?UTF-8?Q?t=C3=A9st.txt?='],

Since some of the line are greater than 80 chars let's format them like this

      'Upper cased base64 encoding' => [
        'tést.txt',
        '=?utf-8?B?dMOpc3QudHh0?=',
      ],

which is much easier to read.

jungle’s picture

Status: Needs work » Needs review
FileSize
2.86 KB
1.08 KB

Thanks! @quietone.

These changes make it header to read than it was and in other places we've kept the single line. Why are we doing this here?

-- from https://www.drupal.org/project/drupal/issues/2875279#comment-13781707

alexpott seems prefer #13

I noticed it. The rule Drupal.Arrays.Array.LongLineDeclaration has not been applied in core yet. One liners seem compact and easier to read. Well, addressing.

jungle’s picture

FileSize
2.94 KB
1.21 KB

@quietone:
Thanks. I doubt that the mix of array (multiline vs single line) will be liked by committers.

All are multilines now.

Kristen Pol’s picture

Issue summary: View changes

Curious why ASCII is used twice.

+++ b/core/tests/Drupal/Tests/Component/Utility/UnicodeTest.php
@@ -44,17 +44,26 @@ public function testMimeHeaderDecode($value, $encoded) {
+        'ASCII',
+        'ASCII',

The only other thing is wording. IMO, it would be cleaner to use:

Uppercase instead of Upper cased

and

Lowercase instead of Lower cased

Otherwise Looks RTBC unless any of the items mentioned above should be changed.

1) Patch still applies cleanly.

2) Tests pass.

3) Includes new test to test the issue.

4) Patch addresses issue from issue summary.

5) Feedback from previous comments has been addressed.

6) Title and issue summary are clear.

longwave’s picture

Status: Needs review » Needs work

Agree with @Kristen Pol about Upper/Lower cased => Uppercase/Lowercase.

Also the new test case seems to be inserted between an existing test case and its data provider; would be better to keep them together and insert the new test after providerTestMimeHeader.

Is it also worth refactoring testMimeHeader() to test the encoding side only? There is some duplication now as both methods test the decoding method with the same set of data.

+++ b/core/tests/Drupal/Tests/Component/Utility/UnicodeTest.php
@@ -26,6 +26,47 @@ public function testMimeHeader($value, $encoded) {
+        'ASCII',
+        'ASCII',

This would fit better with the other test cases if this was "test.txt" I think, I realise this is copied from the other test but maybe worth changing there too?

jungle’s picture

Status: Needs work » Needs review
FileSize
3.12 KB
1.56 KB

@Kristen Pol, @longwave, thanks for reviewing.

Curious why ASCII is used twice.

See \Drupal\Component\Utility\Unicode::mimeHeaderEncode();

For headers consist of ASCII chars, it returns as the original. Or see http://www.rfc-editor.org/rfc/rfc2047.txt for more information.

+++ b/core/tests/Drupal/Tests/Component/Utility/UnicodeTest.php
@@ -26,6 +26,47 @@ public function testMimeHeader($value, $encoded) {
+        ],
+      'Upper cased quoted-printable encoding' => [

Addressing #18, #19 and fixing the wrong indentation above.

Status: Needs review » Needs work

The last submitted patch, 20: 2822334-20.patch, failed testing. View results

jungle’s picture

Status: Needs work » Needs review

Random fail. Requeued.

longwave’s picture

Status: Needs review » Reviewed & tested by the community

> Is it also worth refactoring testMimeHeader() to test the encoding side only?

Thinking again this is out of scope, and an extra two unit test passes don't take any time, so let's leave this alone.

All other review points addressed so RTBC assuming tests pass this time!

jungle’s picture

Thanks @longwave for RTBC-ing!

Thinking again this is out of scope, and an extra two unit test passes don't take any time, so let's leave this alone.

Sorry missed this.

testMimeHeader() covers both ::mimeHeaderDecode, ::mimeHeaderEncode previously, now testMimeHeaderDecode() added to test ::mimeHeaderDecode only. Renaming and refactoring testMimeHeader() to test ::mimeHeaderEncode only makes sense and it's in scope to me.

Addressing, but stay RTBC.

  • catch committed 1c7065d on 9.1.x
    Issue #2822334 by jungle, mbovan, longwave, Kristen Pol, quietone:...
catch’s picture

Title: Unicode::mimeHeaderDecode() doesn't support lowercased encoding » [backport] Unicode::mimeHeaderDecode() doesn't support lowercased encoding
Version: 9.1.x-dev » 8.9.x-dev

Committed 1c7065d and pushed to 9.1.x. Thanks!

This looks backportable to 9.0.x and 8.9.x, but we're in a freeze so just moving there for now.

jungle’s picture

FileSize
3.65 KB
659 bytes

@catch, thanks for committing!

The patch in #24 is vailid to 9.0.x. Rerolling one for 8,9.x

$ cat core/tests/Drupal/Tests/Component/Utility/UnicodeTest.php.rej
diff a/core/tests/Drupal/Tests/Component/Utility/UnicodeTest.php b/core/tests/Drupal/Tests/Component/Utility/UnicodeTest.php    (rejected hunks)
@@ -15,15 +15,13 @@
 class UnicodeTest extends TestCase {
 
   /**
-   * Tests multibyte encoding and decoding.
+   * Tests multibyte encoding.
    *
    * @dataProvider providerTestMimeHeader
    * @covers ::mimeHeaderEncode
-   * @covers ::mimeHeaderDecode
    */
-  public function testMimeHeader($value, $encoded) {
+  public function testMimeHeaderEncode($value, $encoded) {
     $this->assertEquals($encoded, Unicode::mimeHeaderEncode($value));
-    $this->assertEquals($value, Unicode::mimeHeaderDecode($encoded));
   }
 
   /**
jungle’s picture

+++ b/core/tests/Drupal/Tests/Component/Utility/UnicodeTest.php
@@ -23,15 +23,13 @@ public function testSetStatus() {
    * @dataProvider providerTestMimeHeader

@@ -44,9 +42,49 @@ public function testMimeHeader($value, $encoded) {
+   * @dataProvider providerTestMimeHeaderDecode

To pair with providerTestMimeHeaderDecode(), would be better renaming providerTestMimeHeader() to providerTestMimeHeaderEncode(), do it in a separate issue as it got committed to 9.1.x already or ignore it -- it does not matter keeping it untouched.

Tagging "Needs followup" please if necessary.

  • larowlan committed 31ff211 on 9.0.x authored by catch
    Issue #2822334 by jungle, mbovan, longwave, Kristen Pol, quietone:...

  • larowlan committed 7f3f6de on 8.9.x
    Issue #2822334 by jungle, mbovan, longwave, Kristen Pol, quietone, catch...
larowlan’s picture

Title: [backport] Unicode::mimeHeaderDecode() doesn't support lowercased encoding » Unicode::mimeHeaderDecode() doesn't support lowercased encoding
Status: Reviewed & tested by the community » Fixed

Backported the original patch to 8.9.x
Committed 7f3f6de and pushed to 8.9.x.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.