Follow-up to #2297703: [meta] Refactor and remove as many SafeMarkup::set() calls as possible

Remove SafeMarkup::set() from title, it's not needed with strip_tags() and if anything it makes things worse and unsafe if strip_tags fails to safely remove broken tags.

    $head_title = array(
      'title' => SafeMarkup::set(trim(strip_tags($page->getTitle()))),
      'name' => String::checkPlain($site_config->get('name')),
    );

Beta phase evaluation

Reference: https://www.drupal.org/core/beta-changes
Issue category Task because it does not fix any bugs or introduce any new features
Issue priority Major because the current implementation incorrectly implements Safemarkup
Prioritized changes The main goal of this issue is security and code cleanup.
Disruption Not disruptive
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

joelpittet’s picture

joelpittet’s picture

Status: Active » Needs review

Activate!

Status: Needs review » Needs work

The last submitted patch, 1: 2369987-1-safe-markup-set-1.patch, failed testing.

lauriii’s picture

Not sure if its possible to use SafeMarkup::escape for title because its still possible that theres gonna be some special characters and doesnt support UTF8 characters.

joelpittet’s picture

Status: Needs work » Postponed

@lauriii thanks, I meant to postpone this on the #2352155: Remove HtmlFragment/HtmlPage because it will conflict.

Wim Leers’s picture

Status: Postponed » Active

#2352155: Remove HtmlFragment/HtmlPage landed, this can now continue.

pgautam’s picture

Status: Active » Needs review
FileSize
1.73 KB

Patch.

Status: Needs review » Needs work

The last submitted patch, 7: 2369987-7-safe-markup-set-7.patch, failed testing.

aneek’s picture

The tests fail due to twig double escape in title tags.
The function should generate Edit Basic page <script>alert("xss")</script>csF2Z9AM | Drupal while generated is Edit Basic page <script>alert("xss")</script>pNdNdJSz | Drupal.

aneek’s picture

Created a patch to fix the double escaping. But I've used "renderer" service and #markup to print the page title. Also titles are run through SafeMarkup::escape() in this patch. But based on @lauriii's comment #4, I have also added a special character test case to check a node with special characters in the title. So far it worked.

But I think there can be a more better solution rather than using #markup, which itself runs SafeMarkup::set() @ #2273925: Ensure #markup is XSS escaped in Renderer::doRender().

aneek’s picture

Status: Needs work » Needs review

Status: Needs review » Needs work

The last submitted patch, 10: 2369987-10-safe-markup-set.patch, failed testing.

aneek’s picture

Patch #10 introduces a security hole in titles for XSS. Need to change the logic here may be. :-(

aneek’s picture

With a different approach. Let's see if bot returns green :-)

aneek’s picture

Assigned: Unassigned » aneek
Status: Needs work » Needs review

Status: Needs review » Needs work

The last submitted patch, 14: 2369987-14-safe-markup-set.patch, failed testing.

Status: Needs work » Needs review
aneek’s picture

Okay now. Bot returned Green in second attempt. Anyway, review please. Do you think this could be the right approach to address this issue?

Thanks!

idebr’s picture

@aneek The inline template approach looks very clean! It seems the test from #10 got lost along the way, or was this intentional?

joelpittet’s picture

  1. +++ b/core/includes/theme.inc
    @@ -1382,12 +1382,13 @@ function template_preprocess_html(&$variables) {
    -  $separator = '';
    -  foreach ($head_title as $item) {
    -    $output .= $separator . SafeMarkup::escape($item);
    -    $separator = ' | ';
    -  }
    -  $variables['head_title'] = SafeMarkup::set($output);
    +  $build = array(
    +    '#type' => 'inline_template',
    +    '#template' => '{{ head_title|safe_join(" | ") }}',
    +    '#context' => array('head_title' => $head_title),
    +  );
    

    Love it, I think this is a great use of inline_template.

  2. +++ b/core/includes/theme.inc
    @@ -1382,12 +1382,13 @@ function template_preprocess_html(&$variables) {
    +  $output = \Drupal::service('renderer')->render($build);
    

    Is this the new drupal_render()?

idebr’s picture

@joelpittet drupal_render() has been marked deprecated and currently is just a wrapper around the renderer service:

function drupal_render(&$elements, $is_recursive_call = FALSE) {
  return \Drupal::service('renderer')->render($elements, $is_recursive_call);
}
joelpittet’s picture

I feel like I half knew this... but that is cool regardless, thanks @idebr. Swappable now:) So I guess we'll be doing that call directly now off the service.

tstoeckler’s picture

Is the explicit call to the renderer needed at all? AFAIK Twig checks whether a thing is a render array before printing it and calls the renderer itself. That would also allow for alterability in other preprocess functions.

joelpittet’s picture

Status: Needs review » Needs work

@tstoeckler Good call, unless we need that for some stringy stuff later, we can leave it as a renderable array and avoid the early call to drupal_render()/renderer::render()

aneek’s picture

@joelpittet & @idebr thanks for reviews. Joel, I think that we do need the renderer call. But I will check with keeping the inline_template call as array. I think it will print "Array" instead of printing the title.
But today I'll have a look.
@idebr, yes, I removed the test, it was just to ensure that special characters are properly printed with usage of SafeMarkup::escape calls.

aneek’s picture

FileSize
768 bytes
1.82 KB

Based on @tstoeckler, comment created a new patch and working fine I think. (PageTitleTest.php gave me some warning). Lets see what the test bot has to offer.

Thanks!

aneek’s picture

Status: Needs work » Needs review
aneek’s picture

Okay bot returned Green. Any more suggestions? I think this type of fix can be adopted to fix other SafeMarkup::set() removals as well. What do you guys think??

Thanks!

idebr’s picture

Issue summary: View changes
Status: Needs review » Reviewed & tested by the community
FileSize
184.76 KB

Thanks Aneek, I did a manual test to confirm the inline template properly escapes the title exactly once:

I have updated the issue summary to include a beta evaluation.

alexpott’s picture

Status: Reviewed & tested by the community » Fixed

Committed 45268c1 and pushed to 8.0.x. Thanks!

Thanks for adding the beta evaluation to the issue summary.

  • alexpott committed 45268c1 on 8.0.x
    Issue #2369987 by aneek, joelpittet, idebr, pgautam: Remove SafeMarkup::...

  • alexpott committed 4a961b9 on 8.0.x
    Revert "Issue #2369987 by aneek, joelpittet, idebr, pgautam: Remove...
alexpott’s picture

Status: Fixed » Needs work

Reverted this as this causes #2424743: Random testbot fail: "The page does not have double escaped HTML tags." in Drupal\field_ui\Tests\ManageFieldsTest. @aneek in #18 it would have been good if you could have documented why the patch failed the first time.

aneek’s picture

@alexpott, thanks. My mistake sorry :-(. I'll have a look why this happened. Sending #26 patch for re-test.

Edit #1
I created an article with title "&1e5&QE<" with the patch applied, everything seems to be fine. But if and only if I add a new field with the name "&1e5&QE<" then the title comes as double escaped. The value setting in the inline_template array is below,

array (
  '#type' => 'inline_template',
  '#template' => '{{ head_title|safe_join(" | ") }}',
  '#context' => 
  array (
    'head_title' => 
    array (
      'title' => '&amp;1e5&amp;QE&lt; settings for Article',
      'name' => 'D8',
    ),
  ),
)

And in the twig layer it's generating
&amp;amp;1e5&amp;amp;QE&amp;lt; settings for Article | D8
So somehow & is double escaped to &amp;. We do need a way to tell twig that the string is previously escaped from the Drupal or may fix this in twig level?

Edit #2
@alexpott, I don't think the patch failed due to this code changes in #18. I've ran a re-test and again it passed. Not totally clear to me. :-(
Any ideas? Thanks!

aneek’s picture

While checking template_preprocess_html() in about line #1333 I had a debug statement of printing the title generated (trim(strip_tags($variables['page']['#title']))).

$tl = trim(strip_tags($variables['page']['#title']));
var_dump(SafeMarkup::isSafe($tl));

So this gives the following,

  1. A node with "&1e5&QE<" title gives bool(true) while viewing the node.
  2. A node with "&1e5&QE<" title gives bool(false) while in the edit page.
  3. A field with "&1e5&QE<" label gives bool(false) while in the field edit page.

Any ideas why is that?

idebr’s picture

@aneek NodeViewController::title() returns a string that is already checked with String::checkPlain():

  public function title(EntityInterface $node) {
    return String::checkPlain($this->entityManager->getTranslationFromContext($node)->label());
  }
aneek’s picture

@idebr, yes correct. But if you check public function form() in /core/modules/node/src/NodeForm.php then you will find,

if ($this->operation == 'edit') {
      $form['#title'] = $this->t('<em>Edit @type</em> @title', array('@type' => node_get_type_label($node), '@title' => $node->label()));
    }

The node title and the content type name is check plain'd but this returns bool(false) if it's checked via SafeMarkup::isSafe().

aneek’s picture

@alexpott, I debugged this a bit. Here are my findings.

Given node title: &1e5&QE<

In node edit form(public function form() in /core/modules/node/src/NodeForm.php)
the code to generate the title is,

if ($this->operation == 'edit') {
      $form['#title'] = $this->t('<em>Edit @type</em> @title', array('@type' => node_get_type_label($node), '@title' => $node->label()));
    }

While this is processed, this goes to the template_preprocess_html() function to generate the page title. Here the string is marked as safe in SafeMarkup::set(trim(strip_tags($variables['page']['#title']))).

In the template_preprocess_html() initially the RAW HTML title looks like Edit Article &amp;1e5&amp;QE&lt; but if we run this via SafeMarkup::isSafe() it returns FALSE. This should return TRUE.

I guess this is why SafeMarkup::set() code was used to mark the title string safe.

Now, if we remove the SafeMarkup::set() call as below,

diff --git a/core/includes/theme.inc b/core/includes/theme.inc
index b99fc6c..916b77e 100644
--- a/core/includes/theme.inc
+++ b/core/includes/theme.inc
@@ -1323,7 +1323,7 @@ function template_preprocess_html(&$variables) {
   // Construct page title.
   if (!empty($variables['page']['#title'])) {
     $head_title = array(
-      'title' => SafeMarkup::set(trim(strip_tags($variables['page']['#title']))),
+      'title' => trim(strip_tags($variables['page']['#title'])),
       'name' => String::checkPlain($site_config->get('name')),
     );
   }

while the string is passed to twig_drupal_escape_filter() method then inside the condition at line #246,

if (isset($return)) {
    if ($autoescape && SafeMarkup::isSafe($return, $strategy)) {
      return $return;
    }
    // Drupal only supports the HTML escaping strategy, so provide a
    // fallback for other strategies.
    if ($strategy == 'html') {
      return String::checkPlain($return);
    }
    return twig_escape_filter($env, $return, $strategy, $charset, $autoescape);
  }

the double escaping is happening. So in $autoescape && SafeMarkup::isSafe($return, $strategy), $autoescape returns TRUE and the other condition SafeMarkup::isSafe($return, $strategy) returns FALSE. So the string is again going via String::checkPlain().

Isn't it that a string like, A <em>safe</em> markup string. will return TRUE if passed via SafeMarkup::isSafe() but it's not in this case of the page titles. Actually the HTML tags are getting removed by strip_tags(). May be this is the cause?

Please let me know your thoughts. Am I missing something?

aneek’s picture

FileSize
541 bytes

A new patch. Added SafeMarkup::checkAdminXss(), this will check if the title is safe or not and apply filter admin to it. Since the code sequence is like,

'title' => SafeMarkup::checkAdminXss(trim(strip_tags($variables['page']['#title']))),

no tags will be present for checkAdminXss to remove due to strip_tags function. So nothing will be ever removed from the title and thus the title will also be marked as safe. So there will be no double escape.

If someone feels that this needs a more efficient approach. Please let me know. And Does it needs any testing?

Please review!!

aneek’s picture

aneek’s picture

Status: Needs work » Needs review

rteijeiro’s picture

Fixed coding standards for arrays.

ianthomas_uk’s picture

@rteijeiro Thanks for trying, but Drupal coding standards are to use the array() syntax, so the old patch was actually correct. See https://www.drupal.org/coding-standards#array . You've also included several unrelated changes (note how your patch is 21KB, but the previous patch is only 2KB).

Reviewers, please look at #41.

Mile23’s picture

Status: Needs review » Needs work
Issue tags: +Needs reroll

Needs reroll...

$ git apply 2369987-40-safe-markup-set.patch 
error: patch failed: core/includes/theme.inc:1323
error: core/includes/theme.inc: patch does not apply
rteijeiro’s picture

Status: Needs work » Needs review
FileSize
1.86 KB

Re-rolled!

Status: Needs review » Needs work

The last submitted patch, 47: 2369987-47-safe-markup-set.patch, failed testing.

jain_deepak’s picture

Status: Needs work » Needs review
FileSize
2.76 KB

Rerolled

joelpittet’s picture

Assigned: aneek » joelpittet
Status: Needs review » Needs work
Issue tags: -Needs reroll

Thanks for the rerolls @rteijeiro and @jain_deepak.

  1. +++ b/core/includes/theme.inc
    @@ -1062,6 +1062,7 @@ function template_preprocess_item_list(&$variables) {
    + *   - options: Additional $options elements used by the url() function.
    
    @@ -1767,7 +1766,7 @@ function drupal_common_theme() {
    -      'variables' => array('url' => NULL, 'title' => NULL),
    +      'variables' => array('url' => NULL, 'title' => NULL, 'options' => array()),
    

    These lines sneaked in the patch and are unrelated.

  2. +++ b/core/includes/theme.inc
    @@ -1279,24 +1280,22 @@ function template_preprocess_html(&$variables) {
    -  $output = '';
    -  $separator = '';
    -  foreach ($head_title as $item) {
    -    $output .= $separator . SafeMarkup::escape($item);
    -    $separator = ' | ';
    -  }
    -  $variables['head_title'] = SafeMarkup::set($output);
    +  $variables['head_title'] = array(
    +    '#type' => 'inline_template',
    +    '#template' => '{{ head_title|safe_join(" | ") }}',
    +    '#context' => array('head_title' => $head_title),
    +  );
    

    This is a clever approach, though this is completely markup/theming specific I'd rather move that directly into the template instead of fussing about with it in the preprocess.

joelpittet’s picture

I'm going to attempt a couple thing here, bare with me. But FYI there is no need to use SafeMarkup on HTML titles, because we can't put HTML in them anyway.

http://www.w3.org/TR/html401/struct/global.html#h-7.4.2

Titles may contain character entities (for accented characters, special characters, etc.), but may not contain other markup (including comments). Here is a sample document title:

joelpittet’s picture

Assigned: joelpittet » Unassigned
Status: Needs work » Needs review
FileSize
5.13 KB
4.47 KB

Ok going to be a bit bold here. Removes the check_plain and admin_xss stuff and let's the template deal with it as it comes(because it already escapes on print).

I deprecated $variables['head_title_array']because it's nicer if we just use {{ head_title|join(' • ') }} or {{ head_title.slogan }} and control over that presentation in the markup for TX.

akalata’s picture

Status: Needs review » Needs work

The comment for head_title in html.html.twig says "List of text elements...", but is that still the case? Looks like it was changed to a string.

I'm not knowledgeable enough to know if your nuking of the xss stuff is okay. :)

joelpittet’s picture

Status: Needs work » Needs review

@akalata yes it is a list, that's why it needs safe_join filter to joint the array elements.

joelpittet’s picture

Title: Remove SafeMarkup::set() from title on template_preprocess_html » Remove SafeMarkup::set() from 'head' title on template_preprocess_html
joelpittet’s picture

@akalata to be a bit more clear on what my intention is. I made head_title into the array. But treating it like a hash so it's not really a list of indexed elements and that's why you can do {{ head_title.slogan }} which I think reads better as not plural.

Does that sound sane?

lauriii’s picture

Status: Needs review » Reviewed & tested by the community

I think the change is clean and fixes the issue. There is also beta evaluation in the issue summary.

alexpott’s picture

Status: Reviewed & tested by the community » Fixed

Committed 10c8777 and pushed to 8.0.x. Thanks!

Thanks for adding the beta evaluation to the issue summary.

  • alexpott committed 10c8777 on 8.0.x
    Issue #2369987 by aneek, joelpittet, rteijeiro, jain_deepak, pgautam,...

  • alexpott committed 797f9d2 on 8.0.x
    Revert "Issue #2369987 by aneek, joelpittet, rteijeiro, jain_deepak,...
alexpott’s picture

Status: Fixed » Needs work

Core just failed for:

* Drupal\field_ui\Tests\ManageFieldsTest (514 pass(es), 1 fail(s), and 0 exception(s))
   - [fail] [Other] "The page does not have double escaped HTML tags." in FieldUiTestTrait.php on line 89 of Drupal\field_ui\Tests\ManageFieldsTest->fieldUIAddExistingField().

I ran Drupal\field_ui\Tests\ManageFieldsTest->fieldUIAddExistingField() 40 times before committing... seems like we didn't fix the random fail.

alexpott’s picture

I've run the test locally with the patch applied over 100 times and still no fails :(

alexpott’s picture

Status: Needs work » Needs review
FileSize
1.2 KB
5.67 KB

Found it - thanks @joelpittet.

Patch attached fixes the random fail by changing the default label generating method to randomManchineName. If someone wants to test field labels with HTML entities in the label then this should be tested explicitly.

catch’s picture

Status: Needs review » Reviewed & tested by the community

Good find, back to RTBC.

Fabianx’s picture

I opened #2487498: Make randomString always return a > to avoid random test fails as a follow-up to make random test failures a thing of the past (mostly) - at least for double escape issues.

alexpott’s picture

Status: Reviewed & tested by the community » Needs work

Hang on I have a better fix. Patch coming.

alexpott’s picture

Status: Needs work » Needs review
FileSize
1.92 KB
6.39 KB

New fix.

joelpittet’s picture

Status: Needs review » Reviewed & tested by the community

Nice, that should close off the follow-up too I think... @Fabianx or would you still like to explore that? May not be a bad idea to always do that so we can avoid other possible random test failures as they are tricky to track down.

This patch should be gold though;)

catch’s picture

Status: Reviewed & tested by the community » Fixed

Committed/pushed to 8.0.x, thanks!

  • catch committed ac474b9 on 8.0.x
    Issue #2369987 by aneek, alexpott, joelpittet, rteijeiro, pgautam,...
xjm’s picture

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.