Problem/Motivation

XSS::filter() calls SafeMarkup::set(). It shouldn't because it is entangling filtering and safemarkup setting at the wrong level.

For example:

  public function fieldFilterXss($string) {
    // All known XSS vectors are filtered out by
    // \Drupal\Component\Utility\Xss::filter(), all tags in the markup are
    // allowed intentionally by the trait, and no danger is added in by
    // \Drupal\Component\Utility\HTML::normalize(). Since the normalized value
    // is essentially the same markup, designate this string as safe as well.
    // This method is an internal part of field sanitization, so the resultant,
    // sanitized string should be printable as is.
    //
    // @todo Free this memory in https://www.drupal.org/node/2505963.
    return SafeMarkup::set(Html::normalize(Xss::filter($string, $this->allowedTags())));
  }

Proposed resolution

Stop XSS::filter() and XSS::filterAdmin() from setting their results to be safe markup.

Remaining tasks

Commit.

User interface changes

None.

API changes

  • XSS::filter() and XSS::filterAdmin() no longer set their results to be safe markup.
  • Add SafeMarkup::filterXss() as a replacement. SafeMarkup::checkAdminXss already exists but is deprecated in favour of SafeMarkup::filterAdminXss()>/code>.

Beta phase evaluation

Reference: https://www.drupal.org/core/beta-changes
Issue category Task because Drupal works at the moment.
Issue priority Major because SafeMarkup::set() uses unnecessary amounts of memory because strings which don't need to be stored here are.
Prioritized changes The main goal of this issue is performance since we have pages like user permissions and module install that are close to our 64mb once additional roles and modules are available.
Disruption Slightly disruptive for contributed and custom modules because some things that were marked safe are no longer. All they need to do is swap to the SafeMarkup methods if the strings need to be marked as safe. They will no this if they have double escaping issues.
CommentFileSizeAuthor
#118 remove-2506195-118.patch732 bytesmbovan
#111 2506195.111.patch40 KBalexpott
#111 108-111-interdiff.txt1.82 KBalexpott
#108 2506195.108.patch39.95 KBalexpott
#108 99-108-interdiff.txt10.27 KBalexpott
2506195.108.patch39.95 KBalexpott
99-108-interdiff.txt10.27 KBalexpott
2506195.107.patch40.09 KBalexpott
99-107-interdiff.txt8.64 KBalexpott
#99 2506195.99.patch40.85 KBalexpott
#99 91-99-interdiff.txt10.45 KBalexpott
#91 remove-2506195-91.patch39.98 KBjoelpittet
#91 interdiff.txt3.62 KBjoelpittet
#88 2506195.88.patch39.97 KBalexpott
#88 80-88-interdiff.txt4.28 KBalexpott
#80 2506195.80.patch39.23 KBalexpott
#80 75-80-interdiff.txt6.22 KBalexpott
#75 2506195.74.patch39.03 KBalexpott
#75 72-74-interdiff.txt4.41 KBalexpott
#72 2506195.72.patch38.88 KBalexpott
#72 71-72-interdiff.txt2.8 KBalexpott
#71 2506195.71.patch38.77 KBalexpott
#71 69-71-interdiff.txt2.48 KBalexpott
#69 2506195.69.patch38.74 KBalexpott
#69 67-69-interdiff.txt25.62 KBalexpott
#67 2506195.67.patch26.74 KBalexpott
#65 2506195.65.patch26.73 KBalexpott
#65 62-65-interdiff.txt9.61 KBalexpott
#62 2506195.62.patch23.33 KBalexpott
#62 48-62-interdiff.txt2.53 KBalexpott
#48 2506195.48.patch23.34 KBalexpott
#48 43-48-interdiff.txt6.79 KBalexpott
#43 2506195.43.patch19.25 KBalexpott
#43 38-43-interdiff.txt11.98 KBalexpott
#38 2506195.38.patch15.58 KBalexpott
#38 35-38-interdiff.txt1.83 KBalexpott
#35 2506195.35.patch17.42 KBalexpott
#35 25-35-interdiff.txt14.99 KBalexpott
#25 2506195.25.patch9.1 KBalexpott
#25 21-25-interdiff.txt2.15 KBalexpott
#21 2506195.19.patch8.37 KBalexpott
#21 10-19-interdiff.txt4.09 KBalexpott
#10 2506195.10.patch7.61 KBalexpott
#8 3-8-interdiff.txt5.34 KBalexpott
#8 2506195.8.patch7.76 KBalexpott
#3 2506195.2.patch3.06 KBalexpott
#2 2506195.2.patch3.21 KBalexpott
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

alexpott’s picture

alexpott’s picture

FileSize
3.21 KB

First pass at doing this.

alexpott’s picture

FileSize
3.06 KB
alexpott’s picture

Status: Active » Needs review

Let's see if anything breaks.

The last submitted patch, 2: 2506195.2.patch, failed testing.

Status: Needs review » Needs work

The last submitted patch, 3: 2506195.2.patch, failed testing.

alexpott’s picture

Issue tags: +SafeMarkup, +Security, +Performance

Tagging

alexpott’s picture

Status: Needs work » Needs review
FileSize
7.76 KB
5.34 KB

Fixing the test failures.

Status: Needs review » Needs work

The last submitted patch, 8: 2506195.8.patch, failed testing.

alexpott’s picture

Status: Needs work » Needs review
Related issues: +#2501403: Document SafeMarkup::set in Xss::filter
FileSize
7.61 KB

Need to find out where :memory:-prefix is coming from.

pwolanin’s picture

Can we rename checkAdminXss() to something like filterAdminXss()?

xjm’s picture

Re: #11, that's out of scope I think (and a BC break). We could deprecate it and provide a new name in a separate issue for 8.1.x but it is not really in scope for the beta.

xjm’s picture

+++ b/core/lib/Drupal/Component/Utility/Xss.php
@@ -29,14 +29,12 @@ class Xss {
-   * This code does five things:
+   * This code does four things:

This is my favorite part of this patch.

pwolanin’s picture

Looks good to me, but does this need a change notice?

alexpott’s picture

alexpott’s picture

Issue summary: View changes

Improved summary and added beta eval.

Fabianx’s picture

Status: Needs review » Reviewed & tested by the community

RTBC, looks great to me.

It is good to keep SafeMarkup coupled to SafeMarkup space.

xjm’s picture

Status: Reviewed & tested by the community » Needs work
  1. +++ b/core/lib/Drupal/Component/Utility/SafeMarkup.php
    @@ -149,7 +149,38 @@ public static function escape($string) {
       public static function checkAdminXss($string) {
    ...
    +  /**
    +   * Filters HTML to prevent cross-site-scripting (XSS) vulnerabilities.
    +   *
    +   * This method is preferred to \Drupal\Component\Utility\Xss::filter() when
    +   * the result is being used directly in the rendering system.
    +   *
    +   * @param $string
    +   *   The string with raw HTML in it. It will be stripped of everything that
    +   *   can cause an XSS attack.
    +   * @param array $html_tags
    +   *   An array of HTML tags.
    +   *
    +   * @return string
    +   *   An XSS safe version of $string, or an empty string if $string is not
    +   *   valid UTF-8. The string has been marked safe. If the string has already
    +   *   been marked safe, it won't be escaped again.
    +   *
    +   * @see \Drupal\Component\Utility\Xss::filter()
    +   */
    

    Can we add to the documentation for checkAdminXss() with similar details, and also explain the difference between the two methods and have them reference each other?

    I also see now why @pwolanin wanted checkAdminXss() renamed, because the names between Xss and SafeMarkup are not parallel. I've thought about it and I think it would be in scope this issue to deprecate checkAdminXss() for 9.0.0 and make it a wrapper for a filterAdminXss() with the same functionality, because otherwise I think there is a DX regression with this patch. If we did that, the CR would also need to be updated.

  2. +++ b/core/lib/Drupal/Component/Utility/Xss.php
    @@ -49,7 +47,7 @@ class Xss {
    +   * @see \Drupal\Component\Utility\SafeMarkup::filterXss
    

    Minor: needs parens for api.d.o to be happy.

xjm’s picture

I also changed #2506195: Remove SafeMarkup::set() from Xss::filter() to be postponed on this issue.

cilefen’s picture

alexpott’s picture

Issue summary: View changes
Status: Needs work » Needs review
FileSize
4.09 KB
8.37 KB

Thanks for the review @xjm. Addressed both points 1 and 2 with the patch. I've used the new SafeMarkup::filterAdminXss() where the patch was changing stuff to use the new deprecated method.

Re #1 I've done the deprecation but thinking about it I wonder if we should suck it up an break the API because I think API's around security should be as simple and as clear as possible. There are only 11 usages in core.

Fabianx’s picture

Historical background:

- There is checkPlain and SafeMarkup::check().

Therefore: checkAdminXss was created in parallel to filterAdmin().

The reason is:

While checkPlain AND filterXss() filters no matter what, checkAdminXss() would bail if the string is safe.

Because of the String -> SafeMarkup due to PHP 7 conversion (which I favorited), now both checkPlain and check are on the same class.

----

Just some food for thought:

This needs a little thought overall on what the SafeMarkup API should be:

- There is legitimate use cases for checkPlain I know what I am doing, don't interfere.
- There is also legitimate use cases for: Please autoescape this string for me, if it was not marked as safe before, please escape it.

Overall for both cases there are cases that want to store things as safe strings and things that don't.

e.g. Alex Pott was totally right that marking auto-escaped strings as safe from the twig filter layer is ... interesting or rather just wrong ...

pwolanin’s picture

I prefer more consistency in the method names, so I am in favor of the patch and of just removing the method that's marked deprecated.

However, I think the patch needs some work in terms of docs. SafeMarkup::filterXss() will do nothing if the string is already marked safe, so you could not, for example, call it with an empty array as the second arg and be sure that you'd actually remove all HTML tags from the return value.

I guess that is the counter-argument Fabien is presenting as to why "check" is better in the name than "filter".

alexpott’s picture

FileSize
2.15 KB
9.1 KB

re #23 actually I think we can't do the isSafe check in filterXss because of the arguments problem - it's why we don't do it in Xss::filter atm. Let's just remove and document that. Since the current implementation is a security bug waiting to happen.

Also notice some double escaping on node/add because Xss::filterAdmin() no longer leads to a SafeMarkup::set().

star-szr’s picture

I'm wondering if there might be a semi-clever way we can audit to avoid potential regressions like these that aren't covered by tests:

Also notice some double escaping on node/add because Xss::filterAdmin() no longer leads to a SafeMarkup::set().

pwolanin queued 25: 2506195.25.patch for re-testing.

pwolanin’s picture

Issue tags: +Needs change record

Look like we'll need a follow-up to remove uses of the deprecated method.

since we're deprecating that and changing the behavior of the Xss class, this probably needs a change record

alexpott’s picture

Issue tags: -Needs change record

@pwolanin can you review the existing CR https://www.drupal.org/node/2506757?

pwolanin’s picture

Status: Needs review » Reviewed & tested by the community

Sorry, I missed that CR. Looks fine - this is not a huge change.

I like the better separation of concerns and ability to avoid polluting the safe strings list when appropriate.

David_Rothstein’s picture

Status: Reviewed & tested by the community » Needs work
  1. +   * This method is preferred to \Drupal\Component\Utility\Xss::filter() when
    +   * the result is being used directly in the rendering system.
    

    It's not clear what this means. If you look at the example conversions in the patch, in some cases it's pretty obvious they are "being used directly in the rendering system" but in other cases not obvious at all.

    Also this is missing documentation of the opposite direction (why would you choose to use Xss:filter() over this?) which I think would be more important since I would expect people to use SafeMarkup by default when they aren't sure which to choose. And there should be some documentation on Xss::filter() about this too.

  2. I don't understand why filterXss() and filterAdminXss() should behave differently from each other with regard to checking if a string is already marked safe before filtering it. Since both have "filter" in the name, I think they should both always perform filtering. Anything else is very confusing and could lead to security issues.
  3. I would name it filterXssAdmin rather than filterAdminXss, for parity with Drupal 7 and because XSS is the thing that's being filtered.
  4. This existing documentation on the SafeMarkup class:
    /**
     * Manages known safe strings for rendering at the theme layer.
     *
     * The Twig theme engine autoescapes string variables in the template, so it
     * is possible for a string of markup to become double-escaped. SafeMarkup
     * provides a store for known safe strings and methods to manage them
     * throughout the page request.
     *
     * Strings sanitized by self::checkPlain() or Xss::filter() are automatically
     * marked safe, as are markup strings created from render arrays via
     * drupal_render().
     *
     * This class should be limited to internal use only. Module developers should
     * instead use the appropriate
     * @link sanitization sanitization functions @endlink or the
     * @link theme_render theme and render systems @endlink so that the output can
     * can be themed, escaped, and altered properly.
     *
     * @see TwigExtension::escapeFilter()
     * @see twig_render_template()
     * @see sanitization
     * @see theme_render
     */
    class SafeMarkup {
    

    looks very out of date. Not all of it as a result of this patch necessarily, but some of it as a result of this patch :)

David_Rothstein’s picture

Expanding on #2 a bit, the existing API is awfully confusing also:

SafeMarkup::escape() => sometimes escapes
SafeMarkup::checkPlain() => always escapes
SafeMarkup::checkAdminXss() => sometimes filters

And then this patch adds a couple more.

Seems really important to have an API with a naming convention that makes clear what's actually going to happen to the string. I wouldn't shy away from breaking APIs for this, personally. In an ideal world, something like this:

SafeMarkup::checkPlain() => always escapes
SafeMarkup::filterXss() => always filters
SafeMarkup::filterXssAdmin() => always filters
SafeMarkup::escapeIfUnsafe() => sometimes escapes; presumably should be used rarely outside the theme system internal code + SafeMarkup internal code
SafeMarkUp::filterXssAdminIfUnsafe() => sometimes filters (is this even needed? - honestly seems like it shouldn't be)
pwolanin’s picture

@David_Rothstein - I think you have a valid point about the API being confusing and poor naming, but I'm not sure how much change we want to introduce.

The "sometimes filters" options are obviously there to try to reduce double escaping and maybe encourage developers to feel free to call them again if they are unsure. That seems liek a valid goal.

David_Rothstein’s picture

Right, I would say #32 is more a roadmap for a possible followup issue. For this issue I think checkAdminXss() could stay exactly as it is, but the new method should always filter, same as Xss::filterAdmin() does.

alexpott’s picture

Status: Needs work » Needs review
FileSize
14.99 KB
17.42 KB

@David_Rothstein I'm not sure that SafeMarkup::xssFilterAdmin() should always filter - because the list of allowed tags never changes. It is the output of Xss::filterAdmin() that is stored in SafeMarkup's safe list not the input. The XSS filter strips tags, it does not escape them. So at this point this check has nothing to do with double escaping but everything to do with performance. No matter how times you pass <script>alert('here');</script> to the function it will always be escaped.

However I do agree that the methods should be called SafeMarkup::xssFilter() and SafeMarkup::xssFilterAdmin(). I;ve also added more documentation in Xss::methods() to point to the SafeMarkup::methods().

This patch also changes all usage of checkAdminXss() to filterAdminXss() so we can see where is being used for further discussion.

alexpott’s picture

Issue tags: +D8 Accelerate London

Tagging

Status: Needs review » Needs work

The last submitted patch, 35: 2506195.35.patch, failed testing.

alexpott’s picture

Status: Needs work » Needs review
FileSize
1.83 KB
15.58 KB

Some unintended changes got into the last patch.

Status: Needs review » Needs work

The last submitted patch, 38: 2506195.38.patch, failed testing.

David_Rothstein’s picture

  1. @David_Rothstein I'm not sure that SafeMarkup::xssFilterAdmin() should always filter - because the list of allowed tags never changes. It is the output of Xss::filterAdmin() that is stored in SafeMarkup's safe list not the input.

    The issue is that the output is not consistent. When you call the xssFilterAdmin() function you should know that only the tags on that fixed list are left behind, but it's not the case with this patch:

      // This will strip the <marquee> tags.
      SafeMarkup::xssFilterAdmin('<marquee>text</marquee>');
    
      // This won't.
      SafeMarkup::xssFilterAdmin(SafeMarkup::xssFilter('<marquee>text</marquee>', array('marquee')));
    

    And it's inconsistent with what the equivalent-named method on the Xss class does:

      // This will strip the <marquee> tags.
      Xss::filterAdmin(Xss::filter('<marquee>text</marquee>', array('marquee')));
    

    I guess it's only a security issue if the caller of Xss:filter() allows an insecure tag (and if that happens it's risking a security issue no matter what) but it's still unexpected and inconsistent.

  2. However I do agree that the methods should be called SafeMarkup::xssFilter() and SafeMarkup::xssFilterAdmin().

    OK, I guess that is better than what I suggested, yes - this way it reads very similar to Xss:filter() and Xss::filterAdmin().

  3. So looking through the conversions in the patch, this one (and a couple other similar ones internal to the rendering system):
    --- a/core/lib/Drupal/Core/Render/Renderer.php
    +++ b/core/lib/Drupal/Core/Render/Renderer.php
    @@ -368,7 +368,7 @@ protected function doRender(&$elements, $is_root_call = FALSE) {
         if (isset($elements['#markup'])) {
           // @todo Decide how to support non-HTML in the render API in
           //   https://www.drupal.org/node/2501313.
    -      $elements['#markup'] = SafeMarkup::checkAdminXss($elements['#markup']);
    +      $elements['#markup'] = SafeMarkup::xssFilterAdmin($elements['#markup']);
    

    looks like it is appropriate to use the "only filter if it's unsafe" approach, since it's a final fallback to avoid trying to render something unsafe. So I think checkAdminXss should be kept for that, but xssFilterAdmin (which should be used in most places) should not do that check.

pwolanin’s picture

So you want to distinguish check and filter? But in fact "checkPlain" always filters, so that seems even less consistent.

Also - the @placeholders are only filtering if unsafe so people could use those without fear of double escaping an already-escaped string.

pwolanin’s picture

working on the test fail in the last patch waiting for alexpott

alexpott’s picture

Status: Needs work » Needs review
FileSize
11.98 KB
19.25 KB

I've reviewed all the current Xss::filter() calls for ones that need to be SafeMarkup::filterXss() - there is not test coverage of putting html into all of these places - and in fact in some places it is impossible - for example a user name. There is entity validation that prevents that.

@David_Rothstein #40.1 is a good point.

I've undeprecated checkAdminXss as @David_Rothstein suggests and added tests based on the situation outlined in #40.1

xjm’s picture

Having checkAdminXss(), xssFilterAdmin(), and xssFilter() is very confusing. At a minimum, we need more detailed documentation on all three of those methods explaining when you should use each.

With the current patch, we have all of these methods for sanitizing text in core:

Method Usage Which filter? Applies when? Marks safe?
SafeMarkup::escape() 14 calls Full escaping If not isSafe() Yes
SafeMarkup::checkPlain() 386 calls Full escaping Always Yes
SafeMarkup::xssFilter() 4 calls (so far) Caller-specified tag list (limited by default) Always Yes
SafeMarkup::xssFilterAdmin() 9 calls (so far) Permissive tag list Always Yes
SafeMarkup::checkAdminXss() 6 calls Permissive tag list If not isSafe() Yes
Xss;:filter() 37 calls Caller-specified tag list (limited by default) Always No
Xss::filterAdmin() 53 calls Permissive tag list Always No
AllowedTagsXssTrait
::fieldFilterXss()
13 calls Somewhat restricted tag list Always No
xjm’s picture

Further thoughts based on that table:

  1. Several folks have pointed out that escape() and checkPlain() basically do the reverse of you'd expect based on their respective names.
     
  2. checkPlain() and checkAdminXss() do not have parallel behavior in the current patch, so that also adds to the confusion.
     
  3. escape() and checkAdminXss() are the only methods that conditionally apply the relevant filters. As @David_Rothstein points out, that's very important information for the developer and could lead to missed sanitization or exploits if they're used incorrectly. Therefore, if we choose to retain these methods as they are, I think these methods should include *IfUnsafe() or something explicitly in the method name, even if it results in long method names, and I think that it is important enough to justify the disruption of removing or deprecating the current methods and provide renamed methods.
     
  4. Renaming checkPlain() would be very hard to justify and it's a Drupalism that's existed since forever, but all the other methods on SafeMarkup are new in D8.
     
  5. It's not immediately obvious why SafeMarkup::xssFilterAdmin()/Xss::filterAdmin() and SafeMarkup::checkPlain()/htmlspecialchars() have a third variant that only applies the filter if the string is not already safe, but Xss::filter() does not have such a variant.
     
  6. Based on points 1, 3, and 5 and the low usage counts for escape() and checkAdminXss(), is there any reason to actually even retain these as separate methods rather than removing them and inlining the isSafe() check?
     
Fabianx’s picture

+1 to cleaning this up

Just one comment:

I think checkPlain can be deprecated (even if not removed).

My hit list is we would ideally have:

- A function that always escapes and does not mark safe.
- A function that escapes if unsafe and does mark safe.

I am not sure we need:

- A function that always escapes and does mark safe.

I don't think we need:

- A function that escapes if unsafe and does not mark safe. (as before that was on the safe string list already anyway)

alexpott’s picture

I don't think we need a function that

A function that always escapes and does not mark safe.

That's inbuilt in PHP - it's htmlspecialchars

alexpott’s picture

FileSize
6.79 KB
23.34 KB

So how about we proceed here as follows since cleaning up the SafeMarkup methods related to html character escaping should be beyond the scope of removing SafeMarkup::set() from the XSS class.

The patch adds:

  • SafeMarkup::xssFilter() - always filters and always marks safe.
  • SafeMarkup::xssFilterAdmin() - always filters and always marks safe.

The patch deprecates for Drupal 8.0.0:

  • SafeMarkup::checkAdminXss() in favour of SafeMarkup::xssFilterAdmin() and tells users to use SafeMarkup::isSafe() if that is important to them. if we want we could mark this as to be removed in Drupal 9.0.0 but I think not having SafeMarkup::checkAdminXss() will make solving the escape/checkPlain issue easier.

The patch changes:

  • Xss::filter() no longer marks safe - it always filters just as in HEAD.
  • Xss::filterAdmin() no longer marks safe - it always filters just as in HEAD.
  • The only truly valid calls to SafeMarkup::checkAdminXss() were in \Drupal\Core\Render\Renderer so add a protected method to reproduce the functionality

Then we can followup this with two issues:

  1. Remove SafeMarkup::checkAdminXss() - we remove it before 8.0.0 because a clear API around markup security is important. If code has the need to only filter strings not marks as safe it can do the same as \Drupal\Core\Render\Renderer.
  2. Open a discussion what to do about SafeMarkup::checkPlain() and SafeMarkup::escape()
effulgentsia’s picture

My recommendation for changing what's in #44:

  • Rename SafeMarkup::escape() to SafeMarkup::autoEscape().
  • Rename SafeMarkup::xssFilter() to SafeMarkup::autoFilter() and make it apply only when not isSafe().
  • Rename SafeMarkup::xssFilterAdmin() to SafeMarkup::autoFilterAdmin() and make it apply only when not isSafe().
  • Remove SafeMarkup::checkAdminXss() as it's covered by above.

With these changes, we'd have:

A function that always escapes and does not mark safe.

- htmlspecialchars()
- Xss::filter()
- Xss::filterAdmin()

A function that escapes if unsafe and does mark safe.

- SafeMarkup::autoEscape()
- SafeMarkup::autoFilter()
- SafeMarkup::autoFilterAdmin()

I am not sure we need a function that always escapes and does mark safe.

I agree we don't need this for filtering, but I think SafeMarkup::checkPlain() might still has valid use cases. In any case, given its high usage count, we should punt any kind of deprecation of it to a separate issue.

I don't think we need a function that escapes if unsafe and does not mark safe.

Agreed, so this recommendation doesn't have them.

alexpott’s picture

re #49

Rename SafeMarkup::xssFilter() to SafeMarkup::autoFilter() and make it apply only when not isSafe().

This is not possible to do safely because xssFilter() has an argument that affects what is filters.

effulgentsia’s picture

#49 was an x-post with #48. Re #48:

The patch adds:

SafeMarkup::xssFilter() - always filters and always marks safe.
SafeMarkup::xssFilterAdmin() - always filters and always marks safe.

That's counter to "I am not sure we need a function that always escapes and does mark safe.", but I think you're right to have it. Because if a string was e.g., autoFilterAdmin()'d, some other code might want to filter() it further, with a reduced tag list, even if it is already safe by admin standards.

alexpott’s picture

re #49 and @David_Rothstein has already pointed out the problems if SafeMarkup::xssFilterAdmin() checks safe-ness before filtering.

effulgentsia’s picture

Yep, I agree that #48 is the right way to go, so ignore #49.

Even though it means there's a lack of symmetry between SafeMarkup::escape() (escape only if unsafe) and SafeMarkup::xssFilter(Admin)() (filter always), I think that's ok, because escape() is not idempotent, so needs that check, whereas xssFilter*() is idempotent (for the same tag list), so there's no harm in running it multiple times on the same string.

In other words for all 3 functions (escape(), xssFilter(), and xssFilterAdmin()), the following contract is satisfied:

  • You can call them multiple times on the same string and still end up with the string you want to print.
  • Calling any one of them on any string results in a safe to print string (free from XSS).
  • xssFilter() does what you expect: filters for the passed-in tags, even if the string has already been filtered for other tags.

And therefore, whether the implementation checks isSafe() or not is irrelevant to what the caller cares about.

Fabianx’s picture

I just want to point out that we have a strategy parameter within the safeStrings list and within isSafe() and ::set().

Within twig the strategy is used to auto-escape things differently when in different contexts (e.g. css, js, ...).

I had been thinking (but never came to a good conclusion) that strategy might be used for checking explicitly if its already check plained, then filtering further does not make sense, but if filtering something that was filtered with more permissions, then it could still filter it, because it was marked safe for the wrong context.

If we have a hierarchy that should work well (thought we might need to extend the drupalEscapeFilter to check for more strategies.

e.g.

check_plain => filterXss => filterAdminXss

The default strategy is 'html' and what check_plain and drupalEscapeFilter use right now - the to be introduced SafeStringInterface could (unlike Twig_Markup) also either contain a strategy (and for performance reasons allow a public value to access it)

So if something was filtered via XSS, but then is check_plained, it should still be check_plain()'ed.

So we would have:

check_plain: 'html' (and default for everything else)
filterXss: 'xss'
filterAdminXss: 'xss_admin'

If that is too much for this issue, we might consider it as a security hardening for another issue.

alexpott’s picture

For me #54 is way too much for this issue - it's out of scope - the point of this issue is to remove the circular dependency between XSS and SafeMarkup and allow people to xss filter without safemarkup setting.

Fabianx’s picture

Status: Needs review » Reviewed & tested by the community

Given #53 that filterXSS is kinda idempotent, this is RTBC.

The plan looks good to me.

cilefen’s picture

David_Rothstein’s picture

  1. --- a/core/lib/Drupal/Core/Render/Element/HtmlTag.php
    +++ b/core/lib/Drupal/Core/Render/Element/HtmlTag.php
    ...
         // Ensure what we are dealing with is safe.
         // This would be done later anyway in drupal_render().
    -    $prefix = isset($elements['#prefix']) ? SafeMarkup::checkAdminXss($elements['#prefix']) : '';
    -    $suffix = isset($elements['#suffix']) ? SafeMarkup::checkAdminXss($elements['#suffix']) : '';
    +    $prefix = isset($elements['#prefix']) ? Xss::FilterAdmin($elements['#prefix']) : '';
    +    $suffix = isset($elements['#suffix']) ? Xss::FilterAdmin($elements['#suffix']) : '';
    

    This one case looks like it actually should keep the old behavior (only filter if not yet marked safe), no?

  2. +   * This method is preferred to
    +   * \Drupal\Component\Utility\SafeMarkup::xssFilter() when the result is
    +   * not being used directly in the rendering system. For example, when its
    +   * result is being combined with other strings before rendering.
    

    The last sentence isn't a complete sentence. I think it could just turn into a parenthetical remark at the end of the first sentence. Could be fixed on commit.

  3. +   * This method is preferred to
    +   * \Drupal\Component\Utility\SafeMarkup::xssFilterAdmin() when the result is
    +   * not being used directly in the rendering system. For example, when its
    +   * result is being combined with other strings before rendering.
    

    Same as above.

  4. +    // This won't will strip the <marquee> tags and the string with html will be
    +    // marked as safe.
    

    "won't will" => "won't" (could be fixed on commit)

  5. +    // This won't will strip the <marquee> tags.
    

    Same as above.

Overall, sounds like there is a good plan for a followup issue. I did have a similar thought as @Fabianx, that the "only escape if it hasn't been escaped before" behavior could be made generally useful if the different escaping functions stored their escaped strings separately and didn't mix them together. Mixing them together is really the only thing that makes it problematic currently.

alexpott’s picture

#58.1 is correct however currently this is totally broken and insecure because element vs elements. Can we fix and test all of this in a follow up?

Fabianx’s picture

If the follow-up is opened, I would be fine with #59.

David_Rothstein’s picture

Sounds fine to me.

alexpott’s picture

lauriii’s picture

RTBC now

xjm’s picture

Status: Reviewed & tested by the community » Needs work
Issue tags: +Needs change record updates

+1 for the updated solution; I think this is a better direction. Agreed about the current scope for the issue and deferring the rest to followups.

Overall, I think there's still some more work to do to clarify the documentation for the updated API. I'll try to work with @alexpott on this documentation later during the sprint. Here are my specific questions and suggestions just from my review:

  1. +++ b/core/includes/theme.inc
    @@ -1350,7 +1350,7 @@ function template_preprocess_page(&$variables) {
    -  $variables['site_slogan']       = (theme_get_setting('features.slogan') ? Xss::filterAdmin($site_config->get('slogan')) : '');
    +  $variables['site_slogan']       = (theme_get_setting('features.slogan') ? SafeMarkup::xssFilterAdmin($site_config->get('slogan')) : '');
    

    I thought we said that we wanted to avoid expanding the safe string list during theming and that most SafeMarkup use in the theme layer was suspect? Also this could use a render array instead? (I could see that being a followup since it's not directly in scope for this issue.)

  2. +++ b/core/lib/Drupal/Component/Utility/SafeMarkup.php
    @@ -141,17 +141,58 @@ public static function escape($string) {
        * Applies a very permissive XSS/HTML filter for admin-only use.
        *
    +   * This method is preferred to \Drupal\Component\Utility\Xss::filterAdmin()
    +   * when the result is being used directly in the rendering system. If the
    ...
    +   * Filters HTML to prevent cross-site-scripting (XSS) vulnerabilities.
    +   *
    +   * This method is preferred to \Drupal\Component\Utility\Xss::filter() when
    +   * the result is being used directly in the rendering system.
    

    Two questions about these docs:

    1. Isn't the most important point missing? Namely, that the result is marked safe and will not be escaped further during rendering.
    2. I'm a little confused by the sentence "This method is preferred when the result is being used directly in the rendering system." Isn't it the other way around--we want to not expand the safe list during rendering? I thought we had discussed that SafeMarkup should basically never be used anymore once the render and theme process was started.

      Examples/specifics would help, especially as to what "in the rendering system" means. We might also want to rework the overall documentation of SafeMarkup about this and the related changes (that could be a followup).

  3. +++ b/core/lib/Drupal/Component/Utility/SafeMarkup.php
    @@ -141,17 +141,58 @@ public static function escape($string) {
        * @see \Drupal\Component\Utility\Xss::filterAdmin()
    ...
    +   * @see \Drupal\Component\Utility\Xss::filter()
    

    I still think we need a definitive, clear place where all three SafeMarkup escaping methods reference each other and explain what the purpose of each is, in addition to referencing the API used for escaping.

  4. +++ b/core/lib/Drupal/Component/Utility/SafeMarkup.php
    @@ -141,17 +141,58 @@ public static function escape($string) {
    +   * @deprecated as of Drupal 8.0.x, will be removed before Drupal 8.0.0. Use
    +   *   \Drupal\Component\Utility\SafeMarkup::xssFilterAdmin() instead.
    

    Maybe we should expand this to mention (or reference) that the caller should consider whether or not to add the isSafe() check as well? Reading this, it sounds like xssFilterAdmin() is a direct replacement. While presumably one would eventually figure it out by reading the function code, it still is a bit misleading as it stands.

  5. +++ b/core/lib/Drupal/Component/Utility/Xss.php
    @@ -29,14 +29,17 @@ class Xss {
    +   * This method is preferred to
    +   * \Drupal\Component\Utility\SafeMarkup::xssFilter() when the result is
    +   * not being used directly in the rendering system (for example, when its
    +   * result is being combined with other strings before rendering).
    
    @@ -103,6 +106,11 @@ public static function filter($string, $html_tags = array('a', 'em', 'strong', '
    +   * This method is preferred to
    +   * \Drupal\Component\Utility\SafeMarkup::xssFilterAdmin() when the result is
    +   * not being used directly in the rendering system (for example, when its
    +   * result is being combined with other strings before rendering).
    

    Oh, so this bit clarifies my earlier confusion a little bit about using something "directly" in rendering or not, but I think it would be easier to understand if we explained to the developer why they should use one or the other: to avoid bloating the safe string list with partial strings if the whole result will be marked safe. However, it's more complicated than that as well -- if this is combined with (e.g.) SafeMarkup::format(), then the @ placeholder will not work. But if the caller combines a SafeMarkup::format() with SafeMarkup::xssFilter(), then they will get the results they expect for the @.

  6. +++ b/core/lib/Drupal/Component/Utility/Xss.php
    @@ -83,7 +86,7 @@ public static function filter($string, $html_tags = array('a', 'em', 'strong', '
    -    return SafeMarkup::set(preg_replace_callback('%
    +    return preg_replace_callback('%
    

    Awesome.

  7. +++ b/core/lib/Drupal/Core/Render/Element/HtmlTag.php
    @@ -173,8 +174,8 @@ public static function preRenderConditionalComments($element) {
    -    $prefix = isset($elements['#prefix']) ? SafeMarkup::checkAdminXss($elements['#prefix']) : '';
    -    $suffix = isset($elements['#suffix']) ? SafeMarkup::checkAdminXss($elements['#suffix']) : '';
    +    $prefix = isset($elements['#prefix']) ? Xss::FilterAdmin($elements['#prefix']) : '';
    +    $suffix = isset($elements['#suffix']) ? Xss::FilterAdmin($elements['#suffix']) : '';
    

    So, looking at this function, I would think that this counts as being used "directly within the rendering system", yet we are using Xss and not SafeMarkup. Is this because of #2525908: HtmlTag render element's prefix and suffix can be marked safe when they are not?

  8. +++ b/core/lib/Drupal/Core/Render/Renderer.php
    @@ -368,7 +368,7 @@ protected function doRender(&$elements, $is_root_call = FALSE) {
    +      $elements['#markup'] = $this->xssFilterAdminIfUnsafe($elements['#markup']);
    
    @@ -382,7 +382,7 @@ protected function doRender(&$elements, $is_root_call = FALSE) {
    +          $elements[$key] = $this->xssFilterAdminIfUnsafe($elements[$key]);
    
    @@ -468,8 +468,8 @@ protected function doRender(&$elements, $is_root_call = FALSE) {
    +    $prefix = isset($elements['#prefix']) ? $this->xssFilterAdminIfUnsafe($elements['#prefix']) : '';
    +    $suffix = isset($elements['#suffix']) ? $this->xssFilterAdminIfUnsafe($elements['#suffix']) : '';
    

    I started to point out that this is still essentially doing SafeMarkup::set() in Renderer::doRender(), but that's not this issue; it's [#2506851]. I'd like to do this issue before that one and then explore these either in that one or in a followup to both.

  9. +++ b/core/lib/Drupal/Core/Render/Renderer.php
    @@ -653,4 +653,21 @@ public function addCacheableDependency(array &$elements, $dependency) {
    +  protected function xssFilterAdminIfUnsafe($string) {
    

    +1 for making this protected on Xss.

  10. +++ b/core/modules/simpletest/src/Form/SimpletestResultsForm.php
    @@ -313,7 +313,7 @@ public static function addResultForm(array &$form, array $results) {
    -        $row[] = SafeMarkup::checkAdminXss($assertion->message);
    +        $row[] = SafeMarkup::xssFilterAdmin($assertion->message);
    

    Do we have a followup yet for what we discussed about cutting back all the excessive escaping for assertion messages in favor of just letting this escape things? (Not in scope; was just reminded.)

  11. +++ b/core/modules/system/src/Plugin/Block/SystemBrandingBlock.php
    @@ -173,7 +174,7 @@ public function build() {
    -      '#markup' => Xss::filterAdmin($site_config->get('slogan')),
    +      '#markup' => SafeMarkup::xssFilterAdmin($site_config->get('slogan')),
    

    Isn't this redundant following #2273925: Ensure #markup is XSS escaped in Renderer::doRender()?

  12. +++ b/core/modules/taxonomy/src/Controller/TaxonomyController.php
    @@ -54,7 +55,7 @@ public function addForm(VocabularyInterface $taxonomy_vocabulary) {
    -    return Xss::filter($taxonomy_vocabulary->label());
    +    return SafeMarkup::xssFilter($taxonomy_vocabulary->label());
    
    @@ -67,7 +68,7 @@ public function vocabularyTitle(VocabularyInterface $taxonomy_vocabulary) {
    -    return Xss::filter($taxonomy_term->getName());
    +    return SafeMarkup::xssFilter($taxonomy_term->getName());
    
    +++ b/core/modules/user/src/Controller/UserController.php
    @@ -161,7 +162,7 @@ public function userPage() {
    -    return $user ? Xss::filter($user->getUsername()) : '';
    +    return $user ? SafeMarkup::xssFilter($user->getUsername()) : '';
    

    Note to self: Look at the history of these lines. I think we were discussing the inconsistency of these in IRC the other day. There's a followup I think? Should we reference it in an @todo?

  13. +++ b/core/modules/views/src/Plugin/views/field/Field.php
    @@ -671,7 +671,7 @@ public function renderItems($items) {
    -        $separator = $this->options['multi_type'] == 'separator' ? SafeMarkup::checkAdminXss($this->options['separator']) : '';
    +        $separator = $this->options['multi_type'] == 'separator' ? SafeMarkup::xssFilterAdmin($this->options['separator']) : '';
    

    Note to self: look at the issue that converted this to the inline template with the safe join.

  14. +++ b/core/tests/Drupal/Tests/Component/Utility/SafeMarkupTest.php
    @@ -213,6 +213,30 @@ public function testReplace($search, $replace, $subject, $expected, $is_safe) {
    +  public function testAdminXss() {
    

    +1 for the added unit tests!

  15. Do we have the followup for discussing escape() and checkPlain() yet? I'd like to add @todo referencing it for this patch.
  16. I didn't check yet what usages remain for each of the various methods with the latest patch, but I think we should probably create a major followup for evaluating each and double-checking which should be changed. (Just related to the critical meta, not a blocking child of it.)
  17. Finally, I think https://www.drupal.org/node/2506757 needs to be updated for the deprecation of checkAdminXss() etc. We also should check the other SafeMarkup CRs and ensure they are updated for this.

Some of these points probably could be taken on by anyone without needing to wait on me or @alexpott. Otherwise I'll work with him more on the docs later. Thanks everyone! I feel pretty good about the direction this has taken.

alexpott’s picture

Status: Needs work » Needs review
FileSize
9.61 KB
26.73 KB
  1. Sure - changed to use #markup
  2. Added docs to @return - I tried to change the one line summary for the both methods it proved really tricky/impossible for xssAdminFilter(). I agree that we should open a followup to improve SafeMarkup docs.
  3. Yep - let's do this in a followup and document everything on the class level.
  4. Fixed
  5. Added more docs
  6. Thanks :)
  7. Yep - let's handle that in #2525908: HtmlTag render element's prefix and suffix can be marked safe when they are not because currently it is totally broken - that issue has tests to prove it
  8. Yep I agree let's get this done before #2506581: Remove SafeMarkup::set() from Renderer::doRender
  9. Thanks :)
  10. Yes #2514044: Do not use SafeMarkup::format in exceptions and #2501319: Remove SafeMarkup::set() in Error::renderExceptionSafe() and _drupal_log_error() and improve backtrace formatting consistency in _drupal_log_error(), Error::renderExceptionSafe(), and DefaultExceptionSubscriber::onHtml ()
  11. Yes changed to #markup and opened issue to add tests - #2526458: Test XSS filtering of slogan in SystemBrandingBlock
  12. Well... the user name one is hilarious because you can't add html tags to a user name cause entity validation and the getUserName says the result should be checkPlain'd. The taxonomy one is only ever actually called from forum - and I'm not 100% certain the vocabulary one is every called. In short this is a bit of a mess - but I think this should all be other issues (not followups to this issue) since this issue is not making any functional change here.
  13. ...
  14. Yeah I took @David_Rothstein's nice pseudo code and ensured we have a test that confirms that this works how we think it does :)
  15. Yes #2526456: SafeMarkup::checkPlain() and SafeMarkup::escape()'s behaviour is confusing
  16. I've looked at the usages of Xss::filter, Xss::filterAdmin, SafeMarkup::xssFilter and SafeMarkup::xssFilterAdmin and I think that they are all correct.
  17. Good point let's update this on commit.

The template_preprocess_node_add_list() is completely unused so removed as converting to SafeMarkup breaks the documentation that this patch adds.

Status: Needs review » Needs work

The last submitted patch, 65: 2506195.65.patch, failed testing.

alexpott’s picture

Status: Needs work » Needs review
FileSize
26.74 KB

Rerolled.

Status: Needs review » Needs work

The last submitted patch, 67: 2506195.67.patch, failed testing.

alexpott’s picture

Status: Needs work » Needs review
FileSize
25.62 KB
38.74 KB

Well that shows what I know - I got confused with node/add and admin/structure/types both of which list node types.

Also I think @xjm is correct we should review all Xss::filter/filterAdmin usages in core.

For example:

        $content = Xss::filterAdmin(SafeMarkup::format($this->config->get('system.maintenance')->get('message'), array(
          '@site' => $this->config->get('system.site')->get('name'),
        )));
        $response = $this->bareHtmlPageRenderer->renderBarePage(['#markup' => $content], $this->t('Site under maintenance'), 'maintenance_page');

in MaintenanceModeSubscriber::onKernelRequestMaintenance is really interesting. We could completely avoid the safe string list here if we had ::format that did not mark strings as safe.

  /**
   * {@inheritdoc}
   */
  public function getOutput() {
    $output = '<h2 class="tour-tip-label" id="tour-tip-' . $this->getAriaId() . '-label">' . SafeMarkup::checkPlain($this->getLabel()) . '</h2>';
    $output .= '<p class="tour-tip-body" id="tour-tip-' . $this->getAriaId() . '-contents">' . Xss::filterAdmin($this->token->replace($this->getBody())) . '</p>';
    return array('#markup' => $output);
  }

I think the Xss::filterAdmin() here can be removed. Opened #2527360: Review all usages of Xss::filter(), Xss::filterAdmin(), and Html::escape() to discuss and move of this set of issues.

The patch attached completely removes SafeMarkup::xssAdminFilter() since actually the calls to this method are always completely unnecessary. The reasons for this are best explained by the documentation added to SafeMarkup::xssFilter().

  /**
   * Filters HTML for XSS vulnerabilities and marks the result as safe.
   *
   * This method should be used instead of
   * \Drupal\Component\Utility\Xss::filter() when the result is being added to a
   * render array that is construct before rendering begins. Calling this method
   * unnecessarily will result in bloating the safe string list and increases
   * the chance of unintended side-effects.
   *
   * If twig receives a value that is not marked as safe then it will
   * automatically encode special characters in a plain-text string for display
   * as HTML. Therefore, SafeMarkup::filterXss() should only be used when the
   * string contains HTML that needs to be rendered properly by the browser.
   *
   * If you need to filter for admin use (like Xss::filterAdmin) then:
   * - If it's used as part of a render array use #markup to allow the render
   *   system to filter by the admin tag list automatically.
   * - Otherwise, use the SafeMarkup::xssFilter() with tag list provided by
   *   Xss::getAdminTagList() instead. If the caller does not want to filter
   *   strings that are marked safe already it needs to check
   *   SafeMarkup::isSafe() itself.

Status: Needs review » Needs work

The last submitted patch, 69: 2506195.69.patch, failed testing.

alexpott’s picture

Status: Needs work » Needs review
FileSize
2.48 KB
38.77 KB

Thank you Drupal\comment\Tests\CommentAdminTest::testCommentAdmin! Entity lists are tables - so need to use ['data']['#markup'].

alexpott’s picture

FileSize
2.8 KB
38.88 KB

Some improvements to the docs for SafeMarkup::filterXss()

Fabianx’s picture

  1. +++ b/core/includes/theme.inc
    @@ -1350,7 +1350,7 @@ function template_preprocess_page(&$variables) {
    -  $variables['site_slogan']       = (theme_get_setting('features.slogan') ? Xss::filterAdmin($site_config->get('slogan')) : '');
    +  $variables['site_slogan']['#markup'] = (theme_get_setting('features.slogan') ? $site_config->get('slogan') : '');
    

    Nice!

  2. +++ b/core/lib/Drupal/Component/Utility/SafeMarkup.php
    @@ -139,19 +139,71 @@ public static function escape($string) {
    +   * @deprecated as of Drupal 8.0.x, will be removed before Drupal 8.0.0. Most
    +   *   calls can be removed. If it's used as part of a render array use #markup
    +   *   to allow the render system to filter automatically or just allow the Twig
    +   *   to autoescape the value. If the result is not being used directly in the
    +   *   rendering system (for example, when its result is being combined with
    +   *   other strings before rendering), use Xss::filterAdmin(). Otherwise, use
    +   *   SafeMarkup::xssFilter() and the tag list provided by
    +   *   Xss::getAdminTagList() instead. If the caller does not want to filter
    +   *   strings that are marked safe already it needs to check
    +   *   SafeMarkup::isSafe() itself.
    

    This is wonderful!

    I hope we will have a place for these docs when the method gets removed as it is very very helpful.

  3. +++ b/core/lib/Drupal/Component/Utility/Xss.php
    @@ -103,6 +108,13 @@ public static function filter($string, $html_tags = array('a', 'em', 'strong', '
    +   * This method is preferred to
    +   * \Drupal\Component\Utility\SafeMarkup::xssFilter() when the result is
    +   * not being used directly in the rendering system (for example, when its
    +   * result is being combined with other strings before rendering). This avoids
    +   * bloating the safe string list with partial strings if the whole result will
    +   * be marked safe.
    

    This are very helpful docs.

  4. +++ b/core/lib/Drupal/Component/Utility/Xss.php
    @@ -319,4 +335,14 @@ protected static function needsRemoval($html_tags, $elem) {
    +  public static function getAdminTagList() {
    +    return static::$adminTags;
    +  }
    

    That is a clever way to expose that data.

  5. +++ b/core/lib/Drupal/Core/Render/Element/HtmlTag.php
    @@ -50,7 +51,7 @@ public function getInfo() {
    -   * \Drupal\Component\Utility\SafeMarkup::checkAdminXss() to regard it as safe
    +   * \Drupal\Core\Render\Renderer::xssFilterAdminIfUnsafe() to regard it as safe
    

    That method name is so much easier to understand.

  6. +++ b/core/lib/Drupal/Core/Render/Renderer.php
    @@ -651,4 +651,23 @@ public function addCacheableDependency(array &$elements, $dependency) {
    +  protected function xssFilterAdminIfUnsafe($string) {
    +    // @todo https://www.drupal.org/node/2506581 replace with
    +    //   SafeMarkup::isSafe() and Xss::filterAdmin().
    +    return SafeMarkup::checkAdminXss($string);
    +  }
    

    Uh, is there any reason not to call isSafe() and call Xss::filterAdmin() here directly?

    I agree though it can be follow-up.

  7. +++ b/core/modules/forum/forum.module
    @@ -540,7 +540,8 @@ function template_preprocess_forum_list(&$variables) {
    +    // #markup is filtered for admin XSS automatically.
    +    $variables['forums'][$id]->description = array('#markup' => $forum->description->value);
    

    I love the comments explaining that.

    Also makes me feel it was the right decision to filter #markup.

  8. +++ b/core/modules/menu_ui/src/MenuListBuilder.php
    @@ -39,7 +39,8 @@ public function buildRow(EntityInterface $entity) {
    +    // #markup is filtered for admin XSS automatically.
    
    +++ b/core/modules/node/node.module
    @@ -508,7 +509,10 @@ function template_preprocess_node_add_list(&$variables) {
    +          // The description will XSS filtered by the render system.
    +          '#markup' => $type->getDescription(),
    

    nit - can be fixed on commit:

    Can we use the same:

    // #markup is filtered for admin XSS automatically.

    here?

  9. +++ b/core/modules/node/src/Controller/NodeController.php
    @@ -195,7 +194,7 @@ public function revisionOverview(NodeInterface $node) {
    -            'message' => Xss::filter($revision->revision_log->value),
    +            'message' => SafeMarkup::xssFilter($revision->revision_log->value),
    

    Also again a general +1 to keeping subsystems clean of each other.

  10. +++ b/core/modules/system/src/Plugin/Block/SystemBrandingBlock.php
    @@ -173,7 +174,7 @@ public function build() {
    -      '#markup' => Xss::filterAdmin($site_config->get('slogan')),
    +      '#markup' => $site_config->get('slogan'),
    

    nit - could add the comment here as well.

  11. +++ b/core/modules/system/system.admin.inc
    @@ -58,7 +58,7 @@ function template_preprocess_admin_block_content(&$variables) {
    -        $variables['content'][$key]['description'] = Xss::filterAdmin($item['description']);
    +        $variables['content'][$key]['description'] = ['#markup' => $item['description']];
    

    nit - could add the comment here as well.

  12. +++ b/core/modules/views/src/Plugin/Block/ViewsBlock.php
    @@ -32,7 +32,9 @@ public function build() {
    +        // The title will be autoescaped by Twig so it is not necessary to
    +        // XSS filter it.
    +        $output['#title'] = $this->view->getTitle();
    
    +++ b/core/modules/views/src/Plugin/views/display/Page.php
    @@ -181,7 +181,8 @@ public function execute() {
    +        // #title will be auto escaped by Twig.
    +        '#title' => $this->view->getTitle(),
    

    Thanks for great comments!

  13. +++ b/core/modules/views_ui/views_ui.module
    @@ -129,7 +129,8 @@ function views_ui_preprocess_views_view(&$variables) {
    +    // #markup is filtered for admin XSS automatically.
    

    Maybe $variables['title'] = [ '#markup' => ... ] is better.

    Also:

    Once we auto-escape, once we don't.

    Feels a little inconsistent.

  14. +++ b/core/tests/Drupal/Tests/Component/Utility/SafeMarkupTest.php
    @@ -213,6 +214,30 @@ public function testReplace($search, $replace, $subject, $expected, $is_safe) {
    +  public function testAdminXss() {
    +    // Use the predefined XSS admin tag list. This strips the <marquee> tags.
    +    $this->assertEquals('text', SafeMarkup::xssFilter('<marquee>text</marquee>', Xss::getAdminTagList()));
    +
    +    // This won't strip the <marquee> tags and the string with html will be
    +    // marked as safe.
    +    $filtered = SafeMarkup::xssFilter('<marquee>text</marquee>', array('marquee'));
    +    $this->assertEquals('<marquee>text</marquee>', $filtered);
    +    $this->assertTrue(SafeMarkup::isSafe('<marquee>text</marquee>'), 'The string \'<marquee>text</marquee>\' is marked as safe.');
    +
    +    // The default tag list strips the <marquee> tags even though the string is
    +    // marked as safe.
    +    $this->assertEquals('text', SafeMarkup::xssFilter('<marquee>text</marquee>'));
    +
    +    // This won't escape the <marquee> tags since it is marked as safe.
    +    $this->assertEquals('<marquee>text</marquee>', SafeMarkup::escape($filtered));
    +  }
    

    This is truly awesome test coverage!

    I now need to lookup why it works ;).

    Oh, yes, because we no longer check if its safe before filtering.

    Yes, that makes a lot of sense!

  15. +++ b/core/themes/bartik/bartik.theme
    @@ -125,6 +125,6 @@ function _bartik_process_page(&$variables) {
    -    $variables['site_slogan'] = Xss::filterAdmin($site_config->get('slogan'));
    +    $variables['site_slogan']['#markup'] = $site_config->get('slogan');
    

    Little concern:

    Did we define / document that site_slogan is a render array here?

    Else it could be made a string by something, then here trying string-to-array conversion.

    I know that, because I myself changed site_slogan to render array for hacking SiteBrandingBlock, but I think = [ '#markup' => ...].

    Might be safer as the variable is as far as I know not explicitly defined as being a string.

    So that feels like the safer variant.

Mostly nits, some comments.

One @todo is marked as @todo and I will defer to alexpott' judgement why he chose to do that in a follow-up issue.

As that one is big already (and there is a dedicated issue for the renderer) however I agree.

--

For now leaving at 'needs review' and awaiting feedback to the review.

It is almost back to RTBC though.

catch’s picture

Overall looks great, a few nits.

  1. +++ b/core/includes/theme.inc
    @@ -1350,7 +1350,7 @@ function template_preprocess_page(&$variables) {
    +  $variables['site_slogan']['#markup'] = (theme_get_setting('features.slogan') ? $site_config->get('slogan') : '');
    

    Do we already set $variables['foo']['#markup'] in core? This might need adding to hook_preprocess() docs - it at least looks strange to me but I see the point.

  2. +++ b/core/lib/Drupal/Component/Utility/SafeMarkup.php
    @@ -139,19 +139,71 @@ public static function escape($string) {
    +   *
    

    'admin-only use, if not safe' doesn't scan very well for me.

  3. +++ b/core/lib/Drupal/Component/Utility/SafeMarkup.php
    @@ -139,19 +139,71 @@ public static function escape($string) {
    +   * @deprecated as of Drupal 8.0.x, will be removed before Drupal 8.0.0. Most
    +   *   calls can be removed. If it's used as part of a render array use #markup
    +   *   to allow the render system to filter automatically or just allow the Twig
    +   *   to autoescape the value. If the result is not being used directly in the
    

    'If it's used' - if what is used?

  4. +++ b/core/lib/Drupal/Component/Utility/SafeMarkup.php
    @@ -139,19 +139,71 @@ public static function escape($string) {
    +   *
    

    Does the caller really need to check isSafe() - the function body also does that.

  5. +++ b/core/lib/Drupal/Component/Utility/SafeMarkup.php
    @@ -139,19 +139,71 @@ public static function escape($string) {
    +   * If twig receives a value that is not marked as safe then it will
    

    Twig

  6. +++ b/core/lib/Drupal/Component/Utility/SafeMarkup.php
    @@ -139,19 +139,71 @@ public static function escape($string) {
    +   * - If it's used as part of a render array use #markup to allow the render
    

    "it's" is ambiguous again.

  7. +++ b/core/modules/node/node.module
    @@ -9,6 +9,7 @@
    +use Drupal\Component\Utility\SafeMarkup;
    

    Not used?

  8. +++ b/core/modules/node/node.module
    @@ -9,6 +9,7 @@
     use Drupal\Component\Utility\Xss;
    

    Is this still used?

  9. +++ b/core/lib/Drupal/Component/Utility/Xss.php
    @@ -29,14 +29,19 @@ class Xss {
    +   * This method is preferred to
    +   * \Drupal\Component\Utility\SafeMarkup::xssFilter() when the result is not
    +   * being used directly in the rendering system (for example, when its result
    +   * is being combined with other strings before rendering). This avoids
    +   * bloating the safe string list with partial strings if the whole result will
    +   * be marked safe.
    +   *
    

    This is correct but I'm not sure people reading it will grasp the difference. Can improve later though.

  10. +++ b/core/modules/block_content/block_content.pages.inc
    @@ -28,7 +28,10 @@ function template_preprocess_block_content_add_list(&$variables) {
    -      'description' => Xss::filterAdmin($type->getDescription()),
    

    Is there a corresponding use statement that can be dropped?

  11. +++ b/core/modules/block_content/src/BlockContentTypeListBuilder.php
    @@ -45,7 +45,7 @@ public function buildHeader() {
    +    $row['description']['data']['#markup'] = $entity->getDescription();
    

    Same question.

  12. +++ b/core/modules/filter/filter.module
    @@ -430,7 +434,6 @@ function template_preprocess_filter_tips(&$variables) {
    -      $tiplist[$tip_key]['tip'] = Xss::filterAdmin($tiplist[$tip_key]['tip']);
    

    Is this OK?

  13. +++ b/core/modules/node/src/NodeTypeListBuilder.php
    @@ -7,10 +7,10 @@
    +use Drupal\Component\Utility\SafeMarkup;
    

    not used?

alexpott’s picture

FileSize
4.41 KB
39.03 KB

This is re #73 / @FabianX

  1. :)
  2. In the patch in #72 I tried to incorporated all of this in SafeMarkup::filterXss()
  3. :)
  4. :)
  5. :)
  6. Because #2506581: Remove SafeMarkup::set() from Renderer::doRender will be able to complete remove the adding of #markup to the safe string list.
  7. :)
  8. Fixed
  9. :)
  10. Fixed
  11. Fixed
  12. :)
  13. Fixed
  14. Yep that's thanks to @David_Rothstein for his pseudo code.
  15. Changed everything to be = ['#markup' =>$blah]
David_Rothstein’s picture

   $variables['site_name']         = (theme_get_setting('features.name') ? SafeMarkup::checkPlain($site_config->get('name')) : '');
-  $variables['site_slogan']       = (theme_get_setting('features.slogan') ? Xss::filterAdmin($site_config->get('slogan')) : '');
+  $variables['site_slogan']['#markup'] = (theme_get_setting('features.slogan') ? $site_config->get('slogan') : '');

I'm not sure this kind of thing is a great idea. Shouldn't the code that knows how it wants to sanitize something just sanitize it directly (as is the case with the checkPlain in the line immediately above this)?

Relying on code elsewhere to do this makes it harder for someone reading the code to understand what the intention is - in this case, that the site name was entered as plain text but the slogan was entered as HTML.

catch’s picture

I'm not sure this kind of thing is a great idea. Shouldn't the code that knows how it wants to sanitize something just sanitize it directly (as is the case with the checkPlain in the line immediately above this)?

On the other hand sanitizing on render means that templates which don't print variables prepared in preprocess don't do the work at all. This starts to add up for comments/fields, not so much for this example.

Fabianx’s picture

#76 Actually those all should be render arrays by now - but in twig it is really transparent to what is used anyway.

alexpott’s picture

Re #76 actually the checkPlain is completely pointless there since twig would auto-escape that.

alexpott’s picture

FileSize
6.22 KB
39.23 KB

re #74/@catch

  1. This is already documented in theme.api.php - and it reads really well already :)
  2. Fixed
  3. Fixed
  4. Yes once checkAdminXss is removed anything that calls SafeMarkup::xssFilter() and does not want to filter safe strings will need to care about this. However this should be extremely rare. I've changed the docs to say that this should be rare.
  5. Fixed
  6. Fixed
  7. Fixed
  8. Yes it is still used.
  9. Open to suggestions
  10. Yes, fixed
  11. Yes, fixed
  12. Yes... that's done here
             if (isset($tip)) {
    -          $tips[$format->label()][$name] = array('tip' => $tip, 'id' => $name);
    +          $tips[$format->label()][$name] = array(
    +            // #markup is filtered for admin XSS automatically.
    +            'tip' => array('#markup' => $tip),
    +            'id' => $name
    +          );
    
  13. Fixed
Fabianx’s picture

Status: Needs review » Reviewed & tested by the community

All feedback has been addressed, so happy to RTBC this!

catch’s picture

#9 I couldn't think of anything better, so yes let's leave like that and improve later if someone thinks of something.

About to head out but will try to remember to commit once back.

xjm’s picture

Assigned: Unassigned » xjm

Posting notes from a partial review -- need to come back to this after. Assigning to myself.

  1. +++ b/core/lib/Drupal/Component/Utility/SafeMarkup.php
    @@ -15,9 +15,9 @@
    + * created from render arrays via drupal_render().
    

    It occurs to me that a part of #2393329: Replace all drupal_render calls with the service, and inject it, if possible. that we should do now is replacing mentions of drupal_render() in docs with the appropriate method on the renderer. Not introduced here.

  2. +++ b/core/lib/Drupal/Component/Utility/SafeMarkup.php
    @@ -141,17 +141,71 @@ public static function escape($string) {
    +   * Note: this methods only filters if $string is not marked safe already.
    

    Nit: "this methods"

  3. +++ b/core/lib/Drupal/Component/Utility/SafeMarkup.php
    @@ -141,17 +141,71 @@ public static function escape($string) {
    +   * @deprecated as of Drupal 8.0.x, will be removed before Drupal 8.0.0. Most
    

    Overall the deprecation note is much clearer now.

    I think this needs to say it will be removed before 9.0.0 or such? It's too late to deprecate something for 8.0.0 except under very exceptional circumstances, and I don't think this is one.

    Also "most calls can be removed" with "@deprecated" is a little confusing. @deprecated implies that all calls can be removed if we've done it properly...

  4. +++ b/core/lib/Drupal/Component/Utility/SafeMarkup.php
    @@ -141,17 +141,71 @@ public static function escape($string) {
    +   *   calls can be removed. If the string used as part of a render array use
    +   *   #markup to allow the render system to filter automatically or just allow
    

    The user reading this doesn't necessarily know what a render array is or what #markup means. I think we could add an @see to the render/theme API doc group or the sanitization functions doc groups... which should also probably be the canonical place that we explain this concept.

  5. +++ b/core/lib/Drupal/Component/Utility/SafeMarkup.php
    @@ -141,17 +141,71 @@ public static function escape($string) {
    +   *   the Twig to autoescape the value. If the result is not being used
    +   *   directly in the rendering system (for example, when its result is being
    +   *   combined with other strings before rendering), use Xss::filterAdmin().
    

    It's unclear in the sentence whether the parenthetical is an example of something being used directly in the rendering system or if it's an example of something not being used in the rendering system.

    Also, not sure about referring to "Twig" here the way that we do with no other references... though we do provide a link in the class docblocks so maybe that is okay, or something we can iterate on in a followup.

  6. +++ b/core/lib/Drupal/Component/Utility/SafeMarkup.php
    @@ -141,17 +141,71 @@ public static function escape($string) {
    +   * @see \Drupal\Component\Utility\SafeMarkup::xssFilter()
    +   * @see \Drupal\Component\Utility\SafeMarkup::isSafe()
    +   * @see \Drupal\Component\Utility\Xss::filterAdmin()
    +   * @see \Drupal\Component\Utility\Xss::getAdminTagList()
    

    +1 in general. I think we could remove the @see to getAdminTagList and isSafe() though. I can do that on commit if you are cool with it.

  7. +++ b/core/lib/Drupal/Component/Utility/SafeMarkup.php
    @@ -141,17 +141,71 @@ public static function escape($string) {
    +   * Filters HTML for XSS vulnerabilities and marks the result as safe.
    

    Overall, the documentation for this method is excellent now.

    I wonder if we should pull an overview covering these things on the @defgroup documentation. That can be a followup though.

  8. +++ b/core/lib/Drupal/Component/Utility/SafeMarkup.php
    @@ -141,17 +141,71 @@ public static function escape($string) {
    +   * - If the string is used as part of a render array, use #markup to allow the
    +   *   render system to filter by the admin tag list automatically.
    

    Same note about "what is a render array". A reference or link could help with that.

David_Rothstein’s picture

Assigned: xjm » Unassigned

Re #76 actually the checkPlain is completely pointless there since twig would auto-escape that.

Sure, but is there going to be a followup issue to convert all of those too? Right now the patch leaves things in a very inconsistent state.

And if they are all converted we'd have gone from this:

  $variables['site_name']         = (theme_get_setting('features.name') ? SafeMarkup::checkPlain($site_config->get('name')) : '');
  $variables['site_slogan']       = (theme_get_setting('features.slogan') ? Xss::filterAdmin($site_config->get('slogan')) : '');

to this:

  $variables['site_name']         = (theme_get_setting('features.name') ? $site_config->get('name') : '');
  $variables['site_slogan']['#markup'] = (theme_get_setting('features.slogan') ? $site_config->get('slogan') : '');

which still seems confusing to me in terms of what's HTML and what's not (although I guess I could get used to the idea that #markup is HTML by default).

But the patch currently labels most (but not all) of these instances with a comment like this:

// #markup is filtered for admin XSS automatically

If it's confusing enough that we feel the need to label all these, that doesn't sound great.

David_Rothstein’s picture

Assigned: Unassigned » xjm

Crosspost.

David_Rothstein’s picture

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 80: 2506195.80.patch, failed testing.

alexpott’s picture

Status: Needs work » Needs review
FileSize
4.28 KB
39.97 KB

Fixing the test fail - I'm not entirely sure why we're mangling stuff in system_preprocess_block().

re #83

  1. Fixed - linked to the render API docs.
  2. Fixed
  3. Yep removed the "most calls can be removed". The reason I deprecated for 8.0 is that I think narrowing the scope of the security focused functions makes things simpler for developers. Also this method breaks the idempotence of Xss filter methods because whether it filters or not depends on whether the string is already marked safe. Another factor is that I asked @berdir to grep contrib on his d8 projects and he found no usages.
  4. Fixed - linked to the render API docs.
  5. Yep let's iterate is a followup that takes a holistic look at the string santization docs. I've ensured that SafeMarkup::xssFilter and Xss::filterAdmin are listed on https://api.drupal.org/api/drupal/core%21includes%21common.inc/group/san...
  6. I did that because the above docs use the shorthand notation of SafeMarkup::isSafe() and Xss::getAdminTagList() - I thought that this is what we're supposed to do.
  7. Yep let's do that in in the followup mentioned in point 5
  8. Fixed - linked to the render API docs.
Fabianx’s picture

Status: Needs review » Reviewed & tested by the community

Interdiff looks great, back to RTBC.

Agree with #88.3: When I dealt with SafeMarkup issues there was a lot of confusion (partially also because one must learn to get into a mindset of autoescaping first), so anything we can do to reduce that confusion - even if it was breaking things, is a good thing IMHO.

And checkAdminXss was not introduced too long ago, lets just ensure to update the old CRs, please.

#88: Could you address #84?

Leaving assigned to xjm.

alexpott’s picture

From #84

although I guess I could get used to the idea that #markup is HTML by default

This is exactly the point #markup is treated as html by default. It might be worth us expanding the render api properties so we can specify a list of allowed tags here as well.

In Drupal 8 we have auto-filtering and auto-escaping both of which we should be relying on more so that if a template does not print something we don't waste effort escaping.

joelpittet’s picture

FileSize
3.62 KB
39.98 KB

This patch is nitpicks fixes only, leaving RTBC.

Though I have a couple questions:

  1. +++ b/core/themes/seven/seven.theme
    @@ -75,7 +75,6 @@ function seven_preprocess_node_add_list(&$variables) {
    -      $variables['types'][$type->id()]['description'] = Xss::filterAdmin($type->getDescription());
    
    @@ -85,13 +84,12 @@ function seven_preprocess_node_add_list(&$variables) {
    - * separate variables for the label, description, and url.
    + * separate variables for the label and url.
    ...
    -      $variables['types'][$type->id()]['description'] = Xss::filterAdmin($type->getDescription());
    

    Why are the descriptions removed here in the seven theme?

  2. +++ b/core/modules/system/src/Plugin/Block/SystemBrandingBlock.php
    @@ -173,7 +174,8 @@ public function build() {
    +      // #markup is filtered for admin XSS automatically.
    
    +++ b/core/modules/system/system.admin.inc
    @@ -58,7 +58,8 @@ function template_preprocess_admin_block_content(&$variables) {
    +        // #markup is filtered for admin XSS automatically.
    
    +++ b/core/modules/views/src/Plugin/Block/ViewsBlock.php
    @@ -32,7 +32,9 @@ public function build() {
    +        // The title will be autoescaped by Twig so it is not necessary to
    +        // XSS filter it.
    
    +++ b/core/modules/views/src/Plugin/views/display/Page.php
    @@ -181,7 +181,8 @@ public function execute() {
    +        // #title will be auto escaped by Twig.
    

    Do we really need to litter the code with these comments, and if not do you mind if I remove them for you?

xjm’s picture

@joelpittet:

For #91.1, I asked @alexpott and he pointed me to template_preprocess_node_add_list() in node.module and template_preprocess_block_add_list() in block_content.pages.inc, both of which already add the descriptions, so no need to add it twice.

For #91.2, I actually think having inline comments like these is valuable in these cases.

Thanks for fixing the nitpicks; those had been bothering me awhile. :P

xjm’s picture

So re: #88.3, I updated the change record to say that it will be removed:
https://www.drupal.org/node/2506757/revisions/view/8635184/8657360

The other change records also needs an update still:
https://www.drupal.org/node/2296163

xjm’s picture

Also, not blocking for this issue, but we should merge https://www.drupal.org/node/2311123 in with the main change record and update it to mention render arrays. See #2494297: [no patch] Consolidate change records relating to safe markup and filtering/escaping to ensure cross references exist. I'll comment over there.

xjm’s picture

Title: Remove SafeMarkup::set() from XSS::filter() » Remove SafeMarkup::set() from Xss::filter()
xjm’s picture

I also broadened the scope of #2526448: Improve MarkupInterface documentation and render/theme/sanitization API documentation to include updating the theme/render/sanitization docs.

xjm’s picture

Another incomplete review, sorry...

  1. +++ b/core/lib/Drupal/Component/Utility/SafeMarkup.php
    @@ -141,17 +141,74 @@ public static function escape($string) {
    +   * list and increases the chance of unintended side-effects.
    

    Very nitty: no hyphen is needed for "side effects".

  2. +++ b/core/lib/Drupal/Component/Utility/SafeMarkup.php
    @@ -141,17 +141,74 @@ public static function escape($string) {
    +   *   use #markup to allow the render system to filter by the admin tag list
    

    Better would be:

    ..use a '#markup' element...

  3. +++ b/core/lib/Drupal/Component/Utility/SafeMarkup.php
    @@ -141,17 +141,74 @@ public static function escape($string) {
    +   * - Otherwise, use the SafeMarkup::xssFilter() with tag list provided by
    

    "with the tag list provided by..."

  4. +++ b/core/lib/Drupal/Component/Utility/SafeMarkup.php
    @@ -141,17 +141,74 @@ public static function escape($string) {
    +   * This method should only be used instead of Xss::filter() when the result is
    +   * being added to a render array that is constructed before rendering begins.
    

    Hm, so this is a bit confusing still, especially when contrasted by the examples for filtering with the admin list specifically above. Is there really never a case to use SafeMarkup::xssFilter() outside of a render array?

  5. +++ b/core/lib/Drupal/Component/Utility/SafeMarkup.php
    @@ -141,17 +141,74 @@ public static function escape($string) {
    +   *   An XSS safe version of $string, or an empty string if $string is not
    

    Nit: "XSS-safe"

  6. +++ b/core/lib/Drupal/Component/Utility/SafeMarkup.php
    @@ -141,17 +141,74 @@ public static function escape($string) {
    +  public static function xssFilter($string, $html_tags = array('a', 'em', 'strong', 'cite', 'blockquote', 'code', 'ul', 'ol', 'li', 'dl', 'dt', 'dd')) {
    

    Could be a followup: Since we have Xss::getAdminTagList(), should we also provide Xss::getDefaultTagList()? Then we could apply that tag list here if the second argument is empty instead of needing to keep the default in synch with the one in Xss. The only question there is if the caller wants to pass an explicitly empty tag list, but then they should just use checkPlain() TBH.

  7. +++ b/core/lib/Drupal/Component/Utility/Xss.php
    @@ -29,14 +29,19 @@ class Xss {
    +   * being used directly in the rendering system (for example, when its result
    +   * is being combined with other strings before rendering). This avoids
    

    I still think a code snippet for "combined with other strings before rendering" would be helpful to grok what this means.

  8. +++ b/core/lib/Drupal/Component/Utility/Xss.php
    @@ -83,7 +88,7 @@ public static function filter($string, $html_tags = array('a', 'em', 'strong', '
    -    return SafeMarkup::set(preg_replace_callback('%
    +    return preg_replace_callback('%
    

    Did I say "yay" yet?

xjm’s picture

Assigned: xjm » Unassigned
Status: Reviewed & tested by the community » Needs work

Alright, finally finished my review:

  1. +++ b/core/lib/Drupal/Core/Render/Element/HtmlTag.php
    @@ -161,7 +162,7 @@ public static function preRenderConditionalComments($element) {
    -      $expression = SafeMarkup::checkAdminXss($browsers['IE']);
    +      $expression = Xss::filterAdmin($browsers['IE']);
    
    @@ -173,8 +174,8 @@ public static function preRenderConditionalComments($element) {
         // Ensure what we are dealing with is safe.
         // This would be done later anyway in drupal_render().
    -    $prefix = isset($elements['#prefix']) ? SafeMarkup::checkAdminXss($elements['#prefix']) : '';
    -    $suffix = isset($elements['#suffix']) ? SafeMarkup::checkAdminXss($elements['#suffix']) : '';
    +    $prefix = isset($elements['#prefix']) ? Xss::filterAdmin($elements['#prefix']) : '';
    +    $suffix = isset($elements['#suffix']) ? Xss::filterAdmin($elements['#suffix']) : '';
     
         // Now calling SafeMarkup::set is safe, because we ensured the
         // data coming in was at least admin escaped.
    

    Note that the SafeMarkup use is being removed here because there is a SafeMarkup::set() being done on the full result anyway, so there is no reason to pollute the string list with partial sub-strings.

    That said, the inline comment "Ensure what we are dealing with is safe" seems a bit off now; it should probably say "sanitized for XSS" or such.

    Note that I don't quite grok the "this would be done later anyway in drupal_render()" comment either before or after this patch -- like if drupal_render() is going to do this then why do we need to here? But that might be something for the followup about checking usages.

  2. +++ b/core/lib/Drupal/Core/Render/Renderer.php
    @@ -651,4 +651,23 @@ public function addCacheableDependency(array &$elements, $dependency) {
    +    // @todo https://www.drupal.org/node/2506581 replace with
    +    //   SafeMarkup::isSafe() and Xss::filterAdmin().
    +    return SafeMarkup::checkAdminXss($string);
    

    It seemed weird to me to be adding a use of the function we specifically say we're removing, but a reason that is not entirely clear to me @alexpott says it will be easier to leave this as-is and then remove it in the doRender() issue referenced in the inline comment, which is also a blocking part of the critical and therefore will be done before 8.0.0 anyway.

  3. +++ b/core/modules/aggregator/aggregator.module
    @@ -166,7 +166,7 @@ function aggregator_cron() {
    -  return Xss::filter($value, preg_split('/\s+|<|>/', \Drupal::config('aggregator.settings')->get('items.allowed_html'), -1, PREG_SPLIT_NO_EMPTY));
    +  return SafeMarkup::xssFilter($value, preg_split('/\s+|<|>/', \Drupal::config('aggregator.settings')->get('items.allowed_html'), -1, PREG_SPLIT_NO_EMPTY));
    

    So I checked the usages of this and both were being added to #markup elements, which made me wonder why this usage was here. The reason for this is that the allowed tag list is user-configurable and therefore we need to do the filtering here and also mark the string as safe so that it is not re-sanitized.

  4. +++ b/core/modules/aggregator/src/Controller/AggregatorController.php
    @@ -187,7 +187,7 @@ public function pageLast() {
    -    return Xss::filter($aggregator_feed->label());
    +    return SafeMarkup::xssFilter($aggregator_feed->label());
    
    +++ b/core/modules/block_content/block_content.pages.inc
    @@ -28,7 +27,10 @@ function template_preprocess_block_content_add_list(&$variables) {
    +      'description' => array(
    +        // #markup is filtered for admin XSS automatically.
    +        '#markup' => $type->getDescription(),
    +      ),
    
    +++ b/core/modules/block_content/src/BlockContentTypeListBuilder.php
    @@ -45,7 +44,7 @@ public function buildHeader() {
    -    $row['description'] = Xss::filterAdmin($entity->getDescription());
    +    $row['description']['data']['#markup'] = $entity->getDescription();
    
    +++ b/core/modules/comment/src/CommentTypeListBuilder.php
    @@ -46,7 +46,8 @@ public function buildHeader() {
    +    // #markup is filtered for admin XSS automatically.
    +    $row['description']['data'] = ['#markup' => $entity->getDescription()];
    
    +++ b/core/modules/forum/forum.module
    @@ -540,7 +540,8 @@ function template_preprocess_forum_list(&$variables) {
    +    // #markup is filtered for admin XSS automatically.
    +    $variables['forums'][$id]->description = array('#markup' => $forum->description->value);
    
    +++ b/core/modules/menu_ui/src/Controller/MenuController.php
    @@ -77,7 +77,7 @@ public function getParentOptions(Request $request) {
    -    return Xss::filter($menu->label());
    +    return SafeMarkup::xssFilter($menu->label());
    
    +++ b/core/modules/menu_ui/src/MenuListBuilder.php
    @@ -39,7 +39,8 @@ public function buildRow(EntityInterface $entity) {
    -    $row['description'] = Xss::filterAdmin($entity->getDescription());
    +    // #markup is filtered for admin XSS automatically.
    +    $row['description']['data'] = ['#markup' => $entity->getDescription()];
    
    +++ b/core/modules/node/node.module
    @@ -508,7 +508,10 @@ function template_preprocess_node_add_list(&$variables) {
    +        'description' => array(
    +          // #markup is filtered for admin XSS automatically.
    +          '#markup' => $type->getDescription(),
    
    +++ b/core/modules/node/src/Controller/NodeController.php
    @@ -195,7 +194,7 @@ public function revisionOverview(NodeInterface $node) {
    -            'message' => Xss::filter($revision->revision_log->value),
    +            'message' => SafeMarkup::xssFilter($revision->revision_log->value),
    
    +++ b/core/modules/node/src/NodeTypeListBuilder.php
    @@ -39,7 +38,8 @@ public function buildRow(EntityInterface $entity) {
    -    $row['description'] = Xss::filterAdmin($entity->getDescription());
    +    // #markup is filtered for admin XSS automatically.
    +    $row['description']['data'] = ['#markup' => $entity->getDescription()];
    
    +++ b/core/modules/taxonomy/src/Controller/TaxonomyController.php
    @@ -54,7 +55,7 @@ public function addForm(VocabularyInterface $taxonomy_vocabulary) {
    -    return Xss::filter($taxonomy_vocabulary->label());
    +    return SafeMarkup::xssFilter($taxonomy_vocabulary->label());
    
    @@ -67,7 +68,7 @@ public function vocabularyTitle(VocabularyInterface $taxonomy_vocabulary) {
    -    return Xss::filter($taxonomy_term->getName());
    +    return SafeMarkup::xssFilter($taxonomy_term->getName());
    
    +++ b/core/modules/user/src/Controller/UserController.php
    @@ -161,7 +162,7 @@ public function userPage() {
    -    return $user ? Xss::filter($user->getUsername()) : '';
    +    return $user ? SafeMarkup::xssFilter($user->getUsername()) : '';
    

    It's entertaining that we are sanitizing entity labels and descriptions in inconsistent ways in controllers, preprocesses, etc. This could probably be a specific child issue of #2527360: Review all usages of Xss::filter(), Xss::filterAdmin(), and Html::escape().

  5. +++ b/core/modules/views/src/Plugin/Block/ViewsBlock.php
    @@ -7,8 +7,8 @@
    +use Drupal\Component\Utility\SafeMarkup;
     use Drupal\Core\Config\Entity\Query\Query;
    -use Drupal\Component\Utility\Xss;
    
    +++ b/core/modules/views/src/Plugin/views/display/Page.php
    @@ -7,7 +7,7 @@
    -use Drupal\Component\Utility\Xss;
    +use Drupal\Component\Utility\SafeMarkup;
    

    Note: These are adding unused use statements in the current patch.

  6. +++ b/core/modules/views/src/Plugin/Block/ViewsBlock.php
    @@ -32,7 +32,9 @@ public function build() {
    -        $output['#title'] = Xss::filterAdmin($this->view->getTitle());
    +        // The title will be autoescaped by Twig so it is not necessary to
    +        // XSS filter it.
    +        $output['#title'] = $this->view->getTitle();
    
    +++ b/core/modules/views/src/Plugin/views/display/Page.php
    @@ -181,7 +181,8 @@ public function execute() {
    -        '#title' => Xss::filterAdmin($this->view->getTitle()),
    +        // #title will be auto escaped by Twig.
    +        '#title' => $this->view->getTitle(),
    

    I'm concerned that these lines are changing the existing behavior in HEAD from a very permissive filter to a full checkPlain() which seems fairly likely to introduce escaping bugs for existing views.

    @alexpott explained that this was because (a) #title means we (I think?) can't use #markup and (b) Using SafeMarkup::filter() here would be inconsistent with the documentation.

    However, I think I'd rather use SafeMarkup::filter() potentially with a @todo/followup than risk introducing a regression in views titles.

  7. +++ b/core/modules/views_ui/views_ui.module
    @@ -129,7 +129,9 @@ function views_ui_preprocess_views_view(&$variables) {
    -    $variables['title'] = Xss::filterAdmin($view->getTitle());
    +    // The title will be autoescaped by Twig so it is not necessary to XSS
    +    // filter it.
    +    $variables['title'] = $view->getTitle();
    

    When we change the above back to using the admin tag list, we should also simply change this to be
    $variables['title']['#markup']

  8. +++ b/core/tests/Drupal/Tests/Component/Utility/SafeMarkupTest.php
    @@ -8,6 +8,7 @@
    +use Drupal\Component\Utility\Xss;
    

    I think this is an unused use?

  9. +++ b/core/tests/Drupal/Tests/Component/Utility/SafeMarkupTest.php
    @@ -213,6 +214,30 @@ public function testReplace($search, $replace, $subject, $expected, $is_safe) {
    +    // Use the predefined XSS admin tag list. This strips the <marquee> tags.
    +    $this->assertEquals('text', SafeMarkup::xssFilter('<marquee>text</marquee>', Xss::getAdminTagList()));
    

    Should this also have the isSafe() assertion that the next hunk has?

  10. +++ b/core/tests/Drupal/Tests/Component/Utility/SafeMarkupTest.php
    @@ -213,6 +214,30 @@ public function testReplace($search, $replace, $subject, $expected, $is_safe) {
    +    // This won't strip the <marquee> tags and the string with html will be
    +    // marked as safe.
    

    Nit: HTML in caps.

  11. +++ b/core/tests/Drupal/Tests/Component/Utility/SafeMarkupTest.php
    @@ -213,6 +214,30 @@ public function testReplace($search, $replace, $subject, $expected, $is_safe) {
    +    // The default tag list strips the <marquee> tags even though the string is
    +    // marked as safe.
    +    $this->assertEquals('text', SafeMarkup::xssFilter('<marquee>text</marquee>'));
    

    How about: "SafeMarkup::xssFilter() with the default tag list will strip the <marquee> tag even though the string was marked safe above."

  12. +++ b/core/tests/Drupal/Tests/Component/Utility/SafeMarkupTest.php
    @@ -213,6 +214,30 @@ public function testReplace($search, $replace, $subject, $expected, $is_safe) {
    +    // This won't escape the <marquee> tags since it is marked as safe.
    +    $this->assertEquals('<marquee>text</marquee>', SafeMarkup::escape($filtered));
    

    And then "SafeMarkup::escape() will not escape the markup tag since the string was marked safe above."

alexpott’s picture

Thanks @xjm - we're !.

re #97

  1. Fixed
  2. Fixed
  3. Fixed
  4. There is never any point in using SafeMarkup::anyMethod() on something that can not end up in a render array.
  5. Fixed
  6. Filtering and escaping are not the same thing. Xss::filter with an empty array is strip_tags(). That said excellent point about having to maintain two lists of default tags for both Xss::filter and SafeMarkup::xssFilter. Patch attached does things in a slightly different way so we don't do this.
  7. @xjm and I agreed to do this in #2526448: Improve MarkupInterface documentation and render/theme/sanitization API documentation
  8. No not enough yay - only yay once this is in :P

re #98

  1. This all will be addressed in #2296101: Remove SafeMarkup::set() use in \Drupal\Core\Render\Element\HtmlTag::preRenderHtmlTag()
  2. ok
  3. ok
  4. ok
  5. Fixed kinda - both are now due to... point 6...
  6. Changed back and added @todo
  7. Fixed
  8. Nope we use Xss::getAdminTagList()
  9. Sure
  10. Fixed
  11. Yep
  12. Yep and this made me want to add a SafeMarkup::checkPlain() test so I did.

Status: Needs review » Needs work

The last submitted patch, 99: 2506195.99.patch, failed testing.

Status: Needs work » Needs review

Mixologic queued 99: 2506195.99.patch for re-testing.

Fabianx’s picture

Status: Needs review » Reviewed & tested by the community

All changes look good, back to RTBC!

David_Rothstein’s picture

Status: Reviewed & tested by the community » Needs review
  1. Both @joelpittet and I pointed out that this adds the "#markup is filtered for admin XSS automatically" code comment a huge number of times...

    It's not sustainable that people will remember to add this every time #markup is used (in fact, I pointed out in #84 that even this patch doesn't add it consistently) so why use it in so many different random places?

    I think if you're going to make the change this patch proposes you need to own it - the code should speak for itself and a pattern that is used this often shouldn't need a code comment.

  2. Speaking of which, I'm less and less convinced that the pattern is a good idea. There's a big difference between using auto-escaping and auto-filtering as a last-ditch fallback for security, vs. relying on it as a primary mechanism. It's basically telling developers not to think about filtering their own data, which is a lesson that doesn't translate anywhere outside the rendering system...

    Unless someone can point out that this is a pattern Drupal 8 is already using, can we defer that to #2527360: Review all usages of Xss::filter(), Xss::filterAdmin(), and Html::escape() instead? It's not even related to this issue.

  3. The only reason it was partially related was to avoid introducing SafeMarkup::xssFilterAdmin(). However I think it actually does make sense to add that function (like the earlier patches did) even if it turns out core never uses it. It keeps things parallel with Xss::filterAdmin(), for one thing.

    Put another way, the patch has an entire paragraph of code comments explaining what you should do as an alternative if you find yourself wanting to call this nonexistent function:

    +   * If you need to filter for admin use, like Xss::filterAdmin(), then:
    +   * - If the string is used as part of a @link theme_render render array @endlink,
    +   *   use #markup to allow the render system to filter by the admin tag list
    +   *   automatically.
    +   * - Otherwise, use the SafeMarkup::xssFilter() with tag list provided by
    +   *   Xss::getAdminTagList() instead.
    

    That's a pretty good sign that the function should just be there :)

    And the patch is nostalgic for it too - still references it in at least one place:

    + * Strings sanitized by self::checkPlain(), self::xssFilter() or
    + * self::xssFilterAdmin() are automatically marked safe, as are markup strings
    + * created from @link theme_render render arrays @endlink via drupal_render().
    
xjm’s picture

I disagree with #103 points 1 and 2. The full intent of the original Twig autoescape issue in #1825952: Turn on twig autoescape by default was that we would switch from not escaping by default to escaping by default -- not as a "last-ditch fallback" but as our primary sanitization functionality. See the original issue summary from @catch on that issue:

Twig as it stands introduces a fair bit of overhead into the theme system. Fabianx indicated that a lot of this is from marking $variables as secure so they're not double escaped later.

Ideally, if Twig autoescape is going to be enabled, then we should just pass raw variables to it and let it do the work. This way, if a template doesn't print the date, or a link, or whatever might currently be check_plain()ed in preprocess, we're not spending all this time creating it for it to be never used. In general, we should be able to remove a large chunk of preprocess work, and just let Twig sort out variables on demand.

Doing this means that a PHPTemplate engine in contrib is going to have to add back a way to securely format variables, but I don't see a way around this if we don't want a serious performance regression.

(For the last paragraph, we have #2528284: Document that alternate Drupal 8 theme engines must implement auto-escape or they are not secure which somehow never got filed as a followup for #1825952: Turn on twig autoescape by default even though it's discussed numerous times on that issue.)

All of the places in core that still escaping variables early, all the places that escape twice or inappropriately, all the explicit SafeMarkup::set() use -- these are all technical debt from the original change and are certainly not introduced by this issue. We discussed this in Amsterdam, and agreed that we needed to remove early checkPlain(), early rendering, etc., in order to get the full advantage of using Twig, reduce the performance and memory regressions, and set consistent expectations about the behavior of variables that might contain user input in Drupal 8.

The expectation that #markup will be XSS-filtered is also not introduced by this issue. It was added in #2273925: Ensure #markup is XSS escaped in Renderer::doRender() which was committed about 5 weeks ago but had been open for more than a year. It's also already been added in several other places in core and is one of the strategies we've been using to consolidate string filtering and sanitization strategies for the main release-blocking meta in #2280965: [meta] Remove every SafeMarkup::set() call. If I grepped correctly, there are 361 non-test uses of #markup in core.

Using render arrays to defer the XSS filtering has many of the same advantages of using Twig's autoescape to defer escaping, and allows us to provide a more consistent and predicable API and stop doing so much processing twice. It is a pattern we should use when a render array is available. This patch helps clarify that by removing a circular dependency and documenting the existing side effects of using SafeMarkup.

As for the inline comments, I don't see why it's a big deal. I found them helpful reviewing the patch and I think it's valuable to help developers get used to the way that string filtering and sanitization is in Drupal 8. I certainly don't think they're required for every string, but in some places it's helpful to clarify or set a specific expectation for what sanitization is expected. If they really bother people we can remove them; I just don't think that's necessary. I certainly don't think having informative inline comments hurts.

For point 3, it just sounds to me like we should clean up the existing docs. I do think it's better to have fewer methods as this makes the API easier to understand.

I haven't yet reviewed the most recent changes, but from my perspective this issue is both ready and important to complete soon since it helps disentangle the problems that have been plaguing #2280965: [meta] Remove every SafeMarkup::set() call.

catch’s picture

Completely agreed with @xjm's #104. We've not got anywhere near as far as I've hoped removing redundant code in preprocess functions, but this patch does get us closer and unblocks a lot of the other work. The performance/memory implications of SafeMarkup are not trivial - it is adding 3mb to each form cache entry on Berdir's site - and this is because it's overused all over the place in core. This patch is a major step towards fixing that.

Couple of things in the patch, otherwise agreed it looks ready.

  1. +++ b/core/lib/Drupal/Component/Utility/SafeMarkup.php
    @@ -15,9 +15,9 @@
    + * self::xssFilterAdmin() are automatically marked safe, as are markup strings
    

    As David points out - should either add the method or remove the reference to it.

  2. +++ b/core/lib/Drupal/Component/Utility/SafeMarkup.php
    @@ -141,17 +141,80 @@ public static function escape($string) {
    +   * @deprecated as of Drupal 8.0.x, will be removed before Drupal 8.0.0. If the
    

    Are we sure we're removing this before 8.0.0 - or should this be deprecated until 9.0.0? I don't mind, just asking.

David_Rothstein’s picture

I thought the original performance rationale for #1825952: Turn on twig autoescape by default turned out to not be too important in the end? (Or at least it's hard to see how - if I run the "Kitchen Sink" text from http://html-ipsum.com/ through Xss::filterAdmin() it takes less than 1 millisecond even though it's a fair amount of text, and it takes a fraction of that for SafeMarkup::checkPlain().) You'd need to be putting a lot of text in unused template variables or building up a lot of unused render arrays for it to matter. The memory usage may be a different story, although of course that wasn't part of the original plan since the problem was only introduced by SafeMarkup itself...

I'm more interested in what core is actually doing now - but looking through some of the 361 uses of #markup I'm willing to concede that it's probably relying on this in some places already :) A little hard to be sure since it involves searching for a negative, but OK. So I agree: We can keep that as is in the patch, then defer further discussion to #2527360: Review all usages of Xss::filter(), Xss::filterAdmin(), and Html::escape().

I do think the repetitive code comments should be removed (#103.1), but don't feel that strongly about it.

I definitely think we should add back SafeMarkup::xssFilterAdmin() though (#103.3) - it's important to have consistent APIs. I will look into that.

David_Rothstein’s picture

Something like the attached.

A couple notes:

  1. -   * @deprecated as of Drupal 8.0.x, will be removed before Drupal 8.0.0. If the
    -   *   string used as part of a @link theme_render render array @endlink use
    -   *   #markup to allow the render system to filter automatically or just allow
    -   *   the Twig to autoescape the value. If the result is not being used
    ....
    +   * @deprecated as of Drupal 8.0.x, will be removed before Drupal 8.0.0. If
    +   *   the string is used as part of a @link theme_render render array @endlink
    +   *   use #markup to allow the render system to filter automatically. If the
    

    I removed the "just allow the Twig to autoescape the value" because it didn't make sense to me (at least not without further explanation). If you're planning to call checkAdminXss() it means you have HTML, so Twig autoescape isn't a valid alternative.

  2. Also in the above, I found "If the string used as part of a render array use #markup to allow the render system to filter automatically" somewhat confusing because while that will work for something that wasn't a render array before, that you're converting to one now (like the examples in this patch), I don't see how it works generally. If you have an existing render array with its own keys, #markup may not be an option.

    I left it alone since it's marked deprecated anyway, but phrased this a little differently in the documentation for xssFilterAdmin() - see what you think.

alexpott’s picture

Re: #105

  1. Fixed
  2. This has been discussed earlier basically the feeling is that a narrower security API is good and as the test shows things that test if safe first are susceptible to unexpected behaviour

re #106

The diff is to #99 as I was preparing this patch before #107 was posted - I've included the changes to the deprecation notice as it make sense to me.

[Edit had to delete previous comment - confusing number of files uploaded]

The last submitted patch, 2506195.108.patch, failed testing.

David_Rothstein’s picture

Adding a method is a lot simpler than taking away.

I guess it's hard to argue with that :) My point is that by not adding it, this issue is introducing an inconsistency in the API. And I think there are definitely legit uses for xssFilterAdmin() even if none are in core...

If we move this to a followup issue let's at least create a dedicated issue for it. We already have a patch (via the above interdiff) and #2527360: Review all usages of Xss::filter(), Xss::filterAdmin(), and Html::escape() is a lot broader and not really the same thing.

You must be secretly conflicted though - the patch still has a reference to "SafeMarkup::xssFilterAdmin()" :) Otherwise looks good.

alexpott’s picture

You must be secretly conflicted though - the patch still has a reference to "SafeMarkup::xssFilterAdmin()" :) Otherwise looks good.

Indeed :) it was a c&p error from the update to the deprecation notice in #107... I should have spotted that - thanks.

@David_Rothstein - given that I think we have an agreed way forward for everything any chance of an rtbc to unblock the postponed issues?

David_Rothstein’s picture

Status: Needs review » Reviewed & tested by the community
catch’s picture

Committed/pushed to 8.0.x, thanks!

  • catch committed 9024fa6 on 8.0.x
    Issue #2506195 by alexpott, joelpittet, xjm, David_Rothstein, Fabianx,...
xjm’s picture

Status: Reviewed & tested by the community » Fixed

I went to commit this but it looks like it's in. :)

David_Rothstein’s picture

xjm’s picture

I published the CR.

mbovan’s picture

FileSize
732 bytes

Comment nitpick, but won't change status btw...

alexpott’s picture

@mbovan can you create an new issue to fix that? Thanks.

mbovan’s picture

miro_dietiker’s picture

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

mpdonadio’s picture

+++ b/core/lib/Drupal/Component/Utility/Xss.php
@@ -83,7 +88,7 @@ public static function filter($string, $html_tags = array('a', 'em', 'strong', '
     // for output. All other known XSS vectors have been filtered out by this
     // point and any HTML tags remaining will have been deliberately allowed, so
     // it is acceptable to call SafeMarkup::set() on the resultant string.
-    return SafeMarkup::set(preg_replace_callback('%
+    return preg_replace_callback('%
       (

A minor followup is needed as the comment no longer matches the code below it (ie, SafeMarkup::set() is gone). Will try to create/patch when I am finished with the current issue I am working on.

mpdonadio’s picture

Followup w/ comment fix is #2556895: Fix comment in Xss::filter().