Problem/Motivation
XSS::filter()
calls SafeMarkup::set()
. It shouldn't because it is entangling filtering and safemarkup setting at the wrong level.
For example:
public function fieldFilterXss($string) {
// All known XSS vectors are filtered out by
// \Drupal\Component\Utility\Xss::filter(), all tags in the markup are
// allowed intentionally by the trait, and no danger is added in by
// \Drupal\Component\Utility\HTML::normalize(). Since the normalized value
// is essentially the same markup, designate this string as safe as well.
// This method is an internal part of field sanitization, so the resultant,
// sanitized string should be printable as is.
//
// @todo Free this memory in https://www.drupal.org/node/2505963.
return SafeMarkup::set(Html::normalize(Xss::filter($string, $this->allowedTags())));
}
Proposed resolution
Stop XSS::filter()
and XSS::filterAdmin()
from setting their results to be safe markup.
Remaining tasks
Commit.
User interface changes
None.
API changes
XSS::filter()
andXSS::filterAdmin()
no longer set their results to be safe markup.- Add
SafeMarkup::filterXss()
as a replacement.SafeMarkup::checkAdminXss
already exists but is deprecated in favour ofSafeMarkup::filterAdminXss()>/code>.
Beta phase evaluation
Issue category | Task because Drupal works at the moment. |
---|---|
Issue priority | Major because SafeMarkup::set() uses unnecessary amounts of memory because strings which don't need to be stored here are. |
Prioritized changes | The main goal of this issue is performance since we have pages like user permissions and module install that are close to our 64mb once additional roles and modules are available. |
Disruption | Slightly disruptive for contributed and custom modules because some things that were marked safe are no longer. All they need to do is swap to the SafeMarkup methods if the strings need to be marked as safe. They will no this if they have double escaping issues. |
Comment | File | Size | Author |
---|---|---|---|
#118 | remove-2506195-118.patch | 732 bytes | mbovan |
#111 | 2506195.111.patch | 40 KB | alexpott |
#111 | 108-111-interdiff.txt | 1.82 KB | alexpott |
#108 | 2506195.108.patch | 39.95 KB | alexpott |
#108 | 99-108-interdiff.txt | 10.27 KB | alexpott |
Comments
Comment #1
alexpottThis was discussion in https://www.drupal.org/node/2399261
Comment #2
alexpottFirst pass at doing this.
Comment #3
alexpottComment #4
alexpottLet's see if anything breaks.
Comment #7
alexpottTagging
Comment #8
alexpottFixing the test failures.
Comment #10
alexpottNeed to find out where
:memory:-prefix
is coming from.Comment #11
pwolanin CreditAttribution: pwolanin at Acquia commentedCan we rename checkAdminXss() to something like filterAdminXss()?
Comment #12
xjmRe: #11, that's out of scope I think (and a BC break). We could deprecate it and provide a new name in a separate issue for 8.1.x but it is not really in scope for the beta.
Comment #13
xjmThis is my favorite part of this patch.
Comment #14
pwolanin CreditAttribution: pwolanin at Acquia commentedLooks good to me, but does this need a change notice?
Comment #15
alexpottCreated https://www.drupal.org/node/2506757
Comment #16
alexpottImproved summary and added beta eval.
Comment #17
Fabianx CreditAttribution: Fabianx as a volunteer and at Tag1 Consulting commentedRTBC, looks great to me.
It is good to keep SafeMarkup coupled to SafeMarkup space.
Comment #18
xjmCan we add to the documentation for checkAdminXss() with similar details, and also explain the difference between the two methods and have them reference each other?
I also see now why @pwolanin wanted checkAdminXss() renamed, because the names between Xss and SafeMarkup are not parallel. I've thought about it and I think it would be in scope this issue to deprecate checkAdminXss() for 9.0.0 and make it a wrapper for a filterAdminXss() with the same functionality, because otherwise I think there is a DX regression with this patch. If we did that, the CR would also need to be updated.
Minor: needs parens for api.d.o to be happy.
Comment #19
xjmI also changed #2506195: Remove SafeMarkup::set() from Xss::filter() to be postponed on this issue.
Comment #20
cilefen CreditAttribution: cilefen commented@xjm I think you mistyped #2505963: Free strings added to the SafeMarkup list in AllowedTagsXssTrait::fieldFilterXss() when they are no longer needed.
Comment #21
alexpottThanks for the review @xjm. Addressed both points 1 and 2 with the patch. I've used the new
SafeMarkup::filterAdminXss()
where the patch was changing stuff to use the new deprecated method.Re #1 I've done the deprecation but thinking about it I wonder if we should suck it up an break the API because I think API's around security should be as simple and as clear as possible. There are only 11 usages in core.
Comment #22
Fabianx CreditAttribution: Fabianx as a volunteer and at Tag1 Consulting commentedHistorical background:
- There is checkPlain and SafeMarkup::check().
Therefore: checkAdminXss was created in parallel to filterAdmin().
The reason is:
While checkPlain AND filterXss() filters no matter what, checkAdminXss() would bail if the string is safe.
Because of the String -> SafeMarkup due to PHP 7 conversion (which I favorited), now both checkPlain and check are on the same class.
----
Just some food for thought:
This needs a little thought overall on what the SafeMarkup API should be:
- There is legitimate use cases for checkPlain I know what I am doing, don't interfere.
- There is also legitimate use cases for: Please autoescape this string for me, if it was not marked as safe before, please escape it.
Overall for both cases there are cases that want to store things as safe strings and things that don't.
e.g. Alex Pott was totally right that marking auto-escaped strings as safe from the twig filter layer is ... interesting or rather just wrong ...
Comment #23
pwolanin CreditAttribution: pwolanin at Acquia commentedI prefer more consistency in the method names, so I am in favor of the patch and of just removing the method that's marked deprecated.
However, I think the patch needs some work in terms of docs. SafeMarkup::filterXss() will do nothing if the string is already marked safe, so you could not, for example, call it with an empty array as the second arg and be sure that you'd actually remove all HTML tags from the return value.
I guess that is the counter-argument Fabien is presenting as to why "check" is better in the name than "filter".
Comment #24
alexpottComment #25
alexpottre #23 actually I think we can't do the isSafe check in filterXss because of the arguments problem - it's why we don't do it in Xss::filter atm. Let's just remove and document that. Since the current implementation is a security bug waiting to happen.
Also notice some double escaping on node/add because Xss::filterAdmin() no longer leads to a SafeMarkup::set().
Comment #26
star-szrI'm wondering if there might be a semi-clever way we can audit to avoid potential regressions like these that aren't covered by tests:
Comment #28
pwolanin CreditAttribution: pwolanin at Acquia commentedLook like we'll need a follow-up to remove uses of the deprecated method.
since we're deprecating that and changing the behavior of the Xss class, this probably needs a change record
Comment #29
alexpott@pwolanin can you review the existing CR https://www.drupal.org/node/2506757?
Comment #30
pwolanin CreditAttribution: pwolanin at Acquia commentedSorry, I missed that CR. Looks fine - this is not a huge change.
I like the better separation of concerns and ability to avoid polluting the safe strings list when appropriate.
Comment #31
David_Rothstein CreditAttribution: David_Rothstein as a volunteer commentedIt's not clear what this means. If you look at the example conversions in the patch, in some cases it's pretty obvious they are "being used directly in the rendering system" but in other cases not obvious at all.
Also this is missing documentation of the opposite direction (why would you choose to use Xss:filter() over this?) which I think would be more important since I would expect people to use SafeMarkup by default when they aren't sure which to choose. And there should be some documentation on Xss::filter() about this too.
looks very out of date. Not all of it as a result of this patch necessarily, but some of it as a result of this patch :)
Comment #32
David_Rothstein CreditAttribution: David_Rothstein as a volunteer commentedExpanding on #2 a bit, the existing API is awfully confusing also:
And then this patch adds a couple more.
Seems really important to have an API with a naming convention that makes clear what's actually going to happen to the string. I wouldn't shy away from breaking APIs for this, personally. In an ideal world, something like this:
Comment #33
pwolanin CreditAttribution: pwolanin at Acquia commented@David_Rothstein - I think you have a valid point about the API being confusing and poor naming, but I'm not sure how much change we want to introduce.
The "sometimes filters" options are obviously there to try to reduce double escaping and maybe encourage developers to feel free to call them again if they are unsure. That seems liek a valid goal.
Comment #34
David_Rothstein CreditAttribution: David_Rothstein as a volunteer commentedRight, I would say #32 is more a roadmap for a possible followup issue. For this issue I think checkAdminXss() could stay exactly as it is, but the new method should always filter, same as Xss::filterAdmin() does.
Comment #35
alexpott@David_Rothstein I'm not sure that
SafeMarkup::xssFilterAdmin()
should always filter - because the list of allowed tags never changes. It is the output of Xss::filterAdmin() that is stored in SafeMarkup's safe list not the input. The XSS filter strips tags, it does not escape them. So at this point this check has nothing to do with double escaping but everything to do with performance. No matter how times you pass<script>alert('here');</script>
to the function it will always be escaped.However I do agree that the methods should be called
SafeMarkup::xssFilter()
andSafeMarkup::xssFilterAdmin()
. I;ve also added more documentation in Xss::methods() to point to the SafeMarkup::methods().This patch also changes all usage of checkAdminXss() to filterAdminXss() so we can see where is being used for further discussion.
Comment #36
alexpottTagging
Comment #38
alexpottSome unintended changes got into the last patch.
Comment #40
David_Rothstein CreditAttribution: David_Rothstein as a volunteer commentedThe issue is that the output is not consistent. When you call the xssFilterAdmin() function you should know that only the tags on that fixed list are left behind, but it's not the case with this patch:
And it's inconsistent with what the equivalent-named method on the Xss class does:
I guess it's only a security issue if the caller of Xss:filter() allows an insecure tag (and if that happens it's risking a security issue no matter what) but it's still unexpected and inconsistent.
OK, I guess that is better than what I suggested, yes - this way it reads very similar to Xss:filter() and Xss::filterAdmin().
looks like it is appropriate to use the "only filter if it's unsafe" approach, since it's a final fallback to avoid trying to render something unsafe. So I think checkAdminXss should be kept for that, but xssFilterAdmin (which should be used in most places) should not do that check.
Comment #41
pwolanin CreditAttribution: pwolanin at Acquia commentedSo you want to distinguish check and filter? But in fact "checkPlain" always filters, so that seems even less consistent.
Also - the @placeholders are only filtering if unsafe so people could use those without fear of double escaping an already-escaped string.
Comment #42
pwolanin CreditAttribution: pwolanin at Acquia commentedworking on the test fail in the last patchwaiting for alexpottComment #43
alexpottI've reviewed all the current Xss::filter() calls for ones that need to be SafeMarkup::filterXss() - there is not test coverage of putting html into all of these places - and in fact in some places it is impossible - for example a user name. There is entity validation that prevents that.
@David_Rothstein #40.1 is a good point.
I've undeprecated checkAdminXss as @David_Rothstein suggests and added tests based on the situation outlined in #40.1
Comment #44
xjmHaving
checkAdminXss()
,xssFilterAdmin()
, andxssFilter()
is very confusing. At a minimum, we need more detailed documentation on all three of those methods explaining when you should use each.With the current patch, we have all of these methods for sanitizing text in core:
::fieldFilterXss()
Comment #45
xjmFurther thoughts based on that table:
escape()
andcheckPlain()
basically do the reverse of you'd expect based on their respective names.checkPlain()
andcheckAdminXss()
do not have parallel behavior in the current patch, so that also adds to the confusion.escape()
andcheckAdminXss()
are the only methods that conditionally apply the relevant filters. As @David_Rothstein points out, that's very important information for the developer and could lead to missed sanitization or exploits if they're used incorrectly. Therefore, if we choose to retain these methods as they are, I think these methods should include*IfUnsafe()
or something explicitly in the method name, even if it results in long method names, and I think that it is important enough to justify the disruption of removing or deprecating the current methods and provide renamed methods.checkPlain()
would be very hard to justify and it's a Drupalism that's existed since forever, but all the other methods on SafeMarkup are new in D8.SafeMarkup::xssFilterAdmin()
/Xss::filterAdmin()
andSafeMarkup::checkPlain()
/htmlspecialchars()
have a third variant that only applies the filter if the string is not already safe, butXss::filter()
does not have such a variant.escape()
andcheckAdminXss()
, is there any reason to actually even retain these as separate methods rather than removing them and inlining theisSafe()
check?Comment #46
Fabianx CreditAttribution: Fabianx as a volunteer and at Tag1 Consulting commented+1 to cleaning this up
Just one comment:
I think checkPlain can be deprecated (even if not removed).
My hit list is we would ideally have:
- A function that always escapes and does not mark safe.
- A function that escapes if unsafe and does mark safe.
I am not sure we need:
- A function that always escapes and does mark safe.
I don't think we need:
- A function that escapes if unsafe and does not mark safe. (as before that was on the safe string list already anyway)
Comment #47
alexpottI don't think we need a function that
That's inbuilt in PHP - it's
htmlspecialchars
Comment #48
alexpottSo how about we proceed here as follows since cleaning up the SafeMarkup methods related to html character escaping should be beyond the scope of removing
SafeMarkup::set()
from the XSS class.The patch adds:
SafeMarkup::xssFilter()
- always filters and always marks safe.SafeMarkup::xssFilterAdmin()
- always filters and always marks safe.The patch deprecates for Drupal 8.0.0:
SafeMarkup::checkAdminXss()
in favour ofSafeMarkup::xssFilterAdmin()
and tells users to useSafeMarkup::isSafe()
if that is important to them. if we want we could mark this as to be removed in Drupal 9.0.0 but I think not havingSafeMarkup::checkAdminXss()
will make solving the escape/checkPlain issue easier.The patch changes:
Xss::filter()
no longer marks safe - it always filters just as in HEAD.Xss::filterAdmin()
no longer marks safe - it always filters just as in HEAD.SafeMarkup::checkAdminXss()
were in\Drupal\Core\Render\Renderer
so add a protected method to reproduce the functionalityThen we can followup this with two issues:
SafeMarkup::checkAdminXss()
- we remove it before 8.0.0 because a clear API around markup security is important. If code has the need to only filter strings not marks as safe it can do the same as\Drupal\Core\Render\Renderer.
SafeMarkup::checkPlain()
andSafeMarkup::escape()
Comment #49
effulgentsia CreditAttribution: effulgentsia at Acquia commentedMy recommendation for changing what's in #44:
With these changes, we'd have:
- htmlspecialchars()
- Xss::filter()
- Xss::filterAdmin()
- SafeMarkup::autoEscape()
- SafeMarkup::autoFilter()
- SafeMarkup::autoFilterAdmin()
I agree we don't need this for filtering, but I think SafeMarkup::checkPlain() might still has valid use cases. In any case, given its high usage count, we should punt any kind of deprecation of it to a separate issue.
Agreed, so this recommendation doesn't have them.
Comment #50
alexpottre #49
This is not possible to do safely because xssFilter() has an argument that affects what is filters.
Comment #51
effulgentsia CreditAttribution: effulgentsia at Acquia commented#49 was an x-post with #48. Re #48:
That's counter to "I am not sure we need a function that always escapes and does mark safe.", but I think you're right to have it. Because if a string was e.g., autoFilterAdmin()'d, some other code might want to filter() it further, with a reduced tag list, even if it is already safe by admin standards.
Comment #52
alexpottre #49 and @David_Rothstein has already pointed out the problems if SafeMarkup::xssFilterAdmin() checks safe-ness before filtering.
Comment #53
effulgentsia CreditAttribution: effulgentsia at Acquia commentedYep, I agree that #48 is the right way to go, so ignore #49.
Even though it means there's a lack of symmetry between
SafeMarkup::escape()
(escape only if unsafe) andSafeMarkup::xssFilter(Admin)()
(filter always), I think that's ok, because escape() is not idempotent, so needs that check, whereas xssFilter*() is idempotent (for the same tag list), so there's no harm in running it multiple times on the same string.In other words for all 3 functions (escape(), xssFilter(), and xssFilterAdmin()), the following contract is satisfied:
And therefore, whether the implementation checks isSafe() or not is irrelevant to what the caller cares about.
Comment #54
Fabianx CreditAttribution: Fabianx as a volunteer and at Tag1 Consulting commentedI just want to point out that we have a strategy parameter within the safeStrings list and within isSafe() and ::set().
Within twig the strategy is used to auto-escape things differently when in different contexts (e.g. css, js, ...).
I had been thinking (but never came to a good conclusion) that strategy might be used for checking explicitly if its already check plained, then filtering further does not make sense, but if filtering something that was filtered with more permissions, then it could still filter it, because it was marked safe for the wrong context.
If we have a hierarchy that should work well (thought we might need to extend the drupalEscapeFilter to check for more strategies.
e.g.
check_plain => filterXss => filterAdminXss
The default strategy is 'html' and what check_plain and drupalEscapeFilter use right now - the to be introduced SafeStringInterface could (unlike Twig_Markup) also either contain a strategy (and for performance reasons allow a public value to access it)
So if something was filtered via XSS, but then is check_plained, it should still be check_plain()'ed.
So we would have:
check_plain: 'html' (and default for everything else)
filterXss: 'xss'
filterAdminXss: 'xss_admin'
If that is too much for this issue, we might consider it as a security hardening for another issue.
Comment #55
alexpottFor me #54 is way too much for this issue - it's out of scope - the point of this issue is to remove the circular dependency between XSS and SafeMarkup and allow people to xss filter without safemarkup setting.
Comment #56
Fabianx CreditAttribution: Fabianx as a volunteer and at Tag1 Consulting commentedGiven #53 that filterXSS is kinda idempotent, this is RTBC.
The plan looks good to me.
Comment #57
cilefen CreditAttribution: cilefen commentedIf this does what it says, it will eliminate this problem #2505963: Free strings added to the SafeMarkup list in AllowedTagsXssTrait::fieldFilterXss() when they are no longer needed, correct?
Comment #58
David_Rothstein CreditAttribution: David_Rothstein as a volunteer commentedThis one case looks like it actually should keep the old behavior (only filter if not yet marked safe), no?
The last sentence isn't a complete sentence. I think it could just turn into a parenthetical remark at the end of the first sentence. Could be fixed on commit.
Same as above.
"won't will" => "won't" (could be fixed on commit)
Same as above.
Overall, sounds like there is a good plan for a followup issue. I did have a similar thought as @Fabianx, that the "only escape if it hasn't been escaped before" behavior could be made generally useful if the different escaping functions stored their escaped strings separately and didn't mix them together. Mixing them together is really the only thing that makes it problematic currently.
Comment #59
alexpott#58.1 is correct however currently this is totally broken and insecure because element vs elements. Can we fix and test all of this in a follow up?
Comment #60
Fabianx CreditAttribution: Fabianx as a volunteer and at Tag1 Consulting commentedIf the follow-up is opened, I would be fine with #59.
Comment #61
David_Rothstein CreditAttribution: David_Rothstein as a volunteer commentedSounds fine to me.
Comment #62
alexpottOpened #2525908: HtmlTag render element's prefix and suffix can be marked safe when they are not
And resolved the other points from #58
Comment #63
lauriiiRTBC now
Comment #64
xjm+1 for the updated solution; I think this is a better direction. Agreed about the current scope for the issue and deferring the rest to followups.
Overall, I think there's still some more work to do to clarify the documentation for the updated API. I'll try to work with @alexpott on this documentation later during the sprint. Here are my specific questions and suggestions just from my review:
I thought we said that we wanted to avoid expanding the safe string list during theming and that most
SafeMarkup
use in the theme layer was suspect? Also this could use a render array instead? (I could see that being a followup since it's not directly in scope for this issue.)Two questions about these docs:
SafeMarkup
should basically never be used anymore once the render and theme process was started.Examples/specifics would help, especially as to what "in the rendering system" means. We might also want to rework the overall documentation of SafeMarkup about this and the related changes (that could be a followup).
I still think we need a definitive, clear place where all three SafeMarkup escaping methods reference each other and explain what the purpose of each is, in addition to referencing the API used for escaping.
Maybe we should expand this to mention (or reference) that the caller should consider whether or not to add the isSafe() check as well? Reading this, it sounds like
xssFilterAdmin()
is a direct replacement. While presumably one would eventually figure it out by reading the function code, it still is a bit misleading as it stands.Oh, so this bit clarifies my earlier confusion a little bit about using something "directly" in rendering or not, but I think it would be easier to understand if we explained to the developer why they should use one or the other: to avoid bloating the safe string list with partial strings if the whole result will be marked safe. However, it's more complicated than that as well -- if this is combined with (e.g.)
SafeMarkup::format()
, then the@
placeholder will not work. But if the caller combines aSafeMarkup::format()
withSafeMarkup::xssFilter()
, then they will get the results they expect for the@
.Awesome.
So, looking at this function, I would think that this counts as being used "directly within the rendering system", yet we are using
Xss
and notSafeMarkup
. Is this because of #2525908: HtmlTag render element's prefix and suffix can be marked safe when they are not?I started to point out that this is still essentially doing
SafeMarkup::set()
inRenderer::doRender()
, but that's not this issue; it's [#2506851]. I'd like to do this issue before that one and then explore these either in that one or in a followup to both.+1 for making this protected on Xss.
Do we have a followup yet for what we discussed about cutting back all the excessive escaping for assertion messages in favor of just letting this escape things? (Not in scope; was just reminded.)
Isn't this redundant following #2273925: Ensure #markup is XSS escaped in Renderer::doRender()?
Note to self: Look at the history of these lines. I think we were discussing the inconsistency of these in IRC the other day. There's a followup I think? Should we reference it in an @todo?
Note to self: look at the issue that converted this to the inline template with the safe join.
+1 for the added unit tests!
escape()
andcheckPlain()
yet? I'd like to add @todo referencing it for this patch.checkAdminXss()
etc. We also should check the other SafeMarkup CRs and ensure they are updated for this.Some of these points probably could be taken on by anyone without needing to wait on me or @alexpott. Otherwise I'll work with him more on the docs later. Thanks everyone! I feel pretty good about the direction this has taken.
Comment #65
alexpott#markup
getUserName
says the result should be checkPlain'd. The taxonomy one is only ever actually called from forum - and I'm not 100% certain the vocabulary one is every called. In short this is a bit of a mess - but I think this should all be other issues (not followups to this issue) since this issue is not making any functional change here.The
template_preprocess_node_add_list()
is completely unused so removed as converting to SafeMarkup breaks the documentation that this patch adds.Comment #67
alexpottRerolled.
Comment #69
alexpottWell that shows what I know - I got confused with
node/add
andadmin/structure/types
both of which list node types.Also I think @xjm is correct we should review all Xss::filter/filterAdmin usages in core.
For example:
in MaintenanceModeSubscriber::onKernelRequestMaintenance is really interesting. We could completely avoid the safe string list here if we had ::format that did not mark strings as safe.
I think the Xss::filterAdmin() here can be removed. Opened #2527360: Review all usages of Xss::filter(), Xss::filterAdmin(), and Html::escape() to discuss and move of this set of issues.
The patch attached completely removes SafeMarkup::xssAdminFilter() since actually the calls to this method are always completely unnecessary. The reasons for this are best explained by the documentation added to SafeMarkup::xssFilter().
Comment #71
alexpottThank you
Drupal\comment\Tests\CommentAdminTest::testCommentAdmin
! Entity lists are tables - so need to use['data']['#markup']
.Comment #72
alexpottSome improvements to the docs for SafeMarkup::filterXss()
Comment #73
Fabianx CreditAttribution: Fabianx as a volunteer and at Tag1 Consulting commentedNice!
This is wonderful!
I hope we will have a place for these docs when the method gets removed as it is very very helpful.
This are very helpful docs.
That is a clever way to expose that data.
That method name is so much easier to understand.
Uh, is there any reason not to call isSafe() and call Xss::filterAdmin() here directly?
I agree though it can be follow-up.
I love the comments explaining that.
Also makes me feel it was the right decision to filter #markup.
nit - can be fixed on commit:
Can we use the same:
// #markup is filtered for admin XSS automatically.
here?
Also again a general +1 to keeping subsystems clean of each other.
nit - could add the comment here as well.
nit - could add the comment here as well.
Thanks for great comments!
Maybe $variables['title'] = [ '#markup' => ... ] is better.
Also:
Once we auto-escape, once we don't.
Feels a little inconsistent.
This is truly awesome test coverage!
I now need to lookup why it works ;).
Oh, yes, because we no longer check if its safe before filtering.
Yes, that makes a lot of sense!
Little concern:
Did we define / document that site_slogan is a render array here?
Else it could be made a string by something, then here trying string-to-array conversion.
I know that, because I myself changed site_slogan to render array for hacking SiteBrandingBlock, but I think = [ '#markup' => ...].
Might be safer as the variable is as far as I know not explicitly defined as being a string.
So that feels like the safer variant.
Mostly nits, some comments.
One @todo is marked as @todo and I will defer to alexpott' judgement why he chose to do that in a follow-up issue.
As that one is big already (and there is a dedicated issue for the renderer) however I agree.
--
For now leaving at 'needs review' and awaiting feedback to the review.
It is almost back to RTBC though.
Comment #74
catchOverall looks great, a few nits.
Do we already set $variables['foo']['#markup'] in core? This might need adding to hook_preprocess() docs - it at least looks strange to me but I see the point.
'admin-only use, if not safe' doesn't scan very well for me.
'If it's used' - if what is used?
Does the caller really need to check isSafe() - the function body also does that.
Twig
"it's" is ambiguous again.
Not used?
Is this still used?
This is correct but I'm not sure people reading it will grasp the difference. Can improve later though.
Is there a corresponding use statement that can be dropped?
Same question.
Is this OK?
not used?
Comment #75
alexpottThis is re #73 / @FabianX
= ['#markup' =>$blah]
Comment #76
David_Rothstein CreditAttribution: David_Rothstein as a volunteer commentedI'm not sure this kind of thing is a great idea. Shouldn't the code that knows how it wants to sanitize something just sanitize it directly (as is the case with the checkPlain in the line immediately above this)?
Relying on code elsewhere to do this makes it harder for someone reading the code to understand what the intention is - in this case, that the site name was entered as plain text but the slogan was entered as HTML.
Comment #77
catchOn the other hand sanitizing on render means that templates which don't print variables prepared in preprocess don't do the work at all. This starts to add up for comments/fields, not so much for this example.
Comment #78
Fabianx CreditAttribution: Fabianx as a volunteer and at Tag1 Consulting commented#76 Actually those all should be render arrays by now - but in twig it is really transparent to what is used anyway.
Comment #79
alexpottRe #76 actually the checkPlain is completely pointless there since twig would auto-escape that.
Comment #80
alexpottre #74/@catch
Comment #81
Fabianx CreditAttribution: Fabianx as a volunteer and at Tag1 Consulting commentedAll feedback has been addressed, so happy to RTBC this!
Comment #82
catch#9 I couldn't think of anything better, so yes let's leave like that and improve later if someone thinks of something.
About to head out but will try to remember to commit once back.
Comment #83
xjmPosting notes from a partial review -- need to come back to this after. Assigning to myself.
It occurs to me that a part of #2393329: Replace all drupal_render calls with the service, and inject it, if possible. that we should do now is replacing mentions of
drupal_render()
in docs with the appropriate method on the renderer. Not introduced here.Nit: "this methods"
Overall the deprecation note is much clearer now.
I think this needs to say it will be removed before 9.0.0 or such? It's too late to deprecate something for 8.0.0 except under very exceptional circumstances, and I don't think this is one.
Also "most calls can be removed" with "@deprecated" is a little confusing. @deprecated implies that all calls can be removed if we've done it properly...
The user reading this doesn't necessarily know what a render array is or what #markup means. I think we could add an @see to the render/theme API doc group or the sanitization functions doc groups... which should also probably be the canonical place that we explain this concept.
It's unclear in the sentence whether the parenthetical is an example of something being used directly in the rendering system or if it's an example of something not being used in the rendering system.
Also, not sure about referring to "Twig" here the way that we do with no other references... though we do provide a link in the class docblocks so maybe that is okay, or something we can iterate on in a followup.
+1 in general. I think we could remove the @see to getAdminTagList and isSafe() though. I can do that on commit if you are cool with it.
Overall, the documentation for this method is excellent now.
I wonder if we should pull an overview covering these things on the @defgroup documentation. That can be a followup though.
Same note about "what is a render array". A reference or link could help with that.
Comment #84
David_Rothstein CreditAttribution: David_Rothstein as a volunteer commentedSure, but is there going to be a followup issue to convert all of those too? Right now the patch leaves things in a very inconsistent state.
And if they are all converted we'd have gone from this:
to this:
which still seems confusing to me in terms of what's HTML and what's not (although I guess I could get used to the idea that #markup is HTML by default).
But the patch currently labels most (but not all) of these instances with a comment like this:
If it's confusing enough that we feel the need to label all these, that doesn't sound great.
Comment #85
David_Rothstein CreditAttribution: David_Rothstein as a volunteer commentedCrosspost.
Comment #86
David_Rothstein CreditAttribution: David_Rothstein as a volunteer commentedI found the followup issue: #2527360: Review all usages of Xss::filter(), Xss::filterAdmin(), and Html::escape()
Comment #88
alexpottFixing the test fail - I'm not entirely sure why we're mangling stuff in system_preprocess_block().
re #83
Comment #89
Fabianx CreditAttribution: Fabianx as a volunteer and at Tag1 Consulting commentedInterdiff looks great, back to RTBC.
Agree with #88.3: When I dealt with SafeMarkup issues there was a lot of confusion (partially also because one must learn to get into a mindset of autoescaping first), so anything we can do to reduce that confusion - even if it was breaking things, is a good thing IMHO.
And checkAdminXss was not introduced too long ago, lets just ensure to update the old CRs, please.
#88: Could you address #84?
Leaving assigned to xjm.
Comment #90
alexpottFrom #84
This is exactly the point #markup is treated as html by default. It might be worth us expanding the render api properties so we can specify a list of allowed tags here as well.
In Drupal 8 we have auto-filtering and auto-escaping both of which we should be relying on more so that if a template does not print something we don't waste effort escaping.
Comment #91
joelpittetThis patch is nitpicks fixes only, leaving RTBC.
Though I have a couple questions:
Why are the descriptions removed here in the seven theme?
Do we really need to litter the code with these comments, and if not do you mind if I remove them for you?
Comment #92
xjm@joelpittet:
For #91.1, I asked @alexpott and he pointed me to
template_preprocess_node_add_list()
innode.module
andtemplate_preprocess_block_add_list()
inblock_content.pages.inc
, both of which already add the descriptions, so no need to add it twice.For #91.2, I actually think having inline comments like these is valuable in these cases.
Thanks for fixing the nitpicks; those had been bothering me awhile. :P
Comment #93
xjmSo re: #88.3, I updated the change record to say that it will be removed:
https://www.drupal.org/node/2506757/revisions/view/8635184/8657360
The other change records also needs an update still:
https://www.drupal.org/node/2296163
Comment #94
xjmAlso, not blocking for this issue, but we should merge https://www.drupal.org/node/2311123 in with the main change record and update it to mention render arrays. See #2494297: [no patch] Consolidate change records relating to safe markup and filtering/escaping to ensure cross references exist. I'll comment over there.
Comment #95
xjmComment #96
xjmI also broadened the scope of #2526448: Improve MarkupInterface documentation and render/theme/sanitization API documentation to include updating the theme/render/sanitization docs.
Comment #97
xjmAnother incomplete review, sorry...
Very nitty: no hyphen is needed for "side effects".
Better would be:
"with the tag list provided by..."
Hm, so this is a bit confusing still, especially when contrasted by the examples for filtering with the admin list specifically above. Is there really never a case to use
SafeMarkup::xssFilter()
outside of a render array?Nit: "XSS-safe"
Could be a followup: Since we have
Xss::getAdminTagList()
, should we also provideXss::getDefaultTagList()
? Then we could apply that tag list here if the second argument is empty instead of needing to keep the default in synch with the one inXss
. The only question there is if the caller wants to pass an explicitly empty tag list, but then they should just usecheckPlain()
TBH.I still think a code snippet for "combined with other strings before rendering" would be helpful to grok what this means.
Did I say "yay" yet?
Comment #98
xjmAlright, finally finished my review:
Note that the SafeMarkup use is being removed here because there is a
SafeMarkup::set()
being done on the full result anyway, so there is no reason to pollute the string list with partial sub-strings.That said, the inline comment "Ensure what we are dealing with is safe" seems a bit off now; it should probably say "sanitized for XSS" or such.
Note that I don't quite grok the "this would be done later anyway in
drupal_render()
" comment either before or after this patch -- like ifdrupal_render()
is going to do this then why do we need to here? But that might be something for the followup about checking usages.It seemed weird to me to be adding a use of the function we specifically say we're removing, but a reason that is not entirely clear to me @alexpott says it will be easier to leave this as-is and then remove it in the doRender() issue referenced in the inline comment, which is also a blocking part of the critical and therefore will be done before 8.0.0 anyway.
So I checked the usages of this and both were being added to
#markup
elements, which made me wonder why this usage was here. The reason for this is that the allowed tag list is user-configurable and therefore we need to do the filtering here and also mark the string as safe so that it is not re-sanitized.It's entertaining that we are sanitizing entity labels and descriptions in inconsistent ways in controllers, preprocesses, etc. This could probably be a specific child issue of #2527360: Review all usages of Xss::filter(), Xss::filterAdmin(), and Html::escape().
Note: These are adding unused use statements in the current patch.
I'm concerned that these lines are changing the existing behavior in HEAD from a very permissive filter to a full
checkPlain()
which seems fairly likely to introduce escaping bugs for existing views.@alexpott explained that this was because (a)
#title
means we (I think?) can't use#markup
and (b) UsingSafeMarkup::filter()
here would be inconsistent with the documentation.However, I think I'd rather use SafeMarkup::filter() potentially with a @todo/followup than risk introducing a regression in views titles.
When we change the above back to using the admin tag list, we should also simply change this to be
$variables['title']['#markup']
I think this is an unused use?
Should this also have the
isSafe()
assertion that the next hunk has?Nit: HTML in caps.
How about: "SafeMarkup::xssFilter() with the default tag list will strip the
<marquee>
tag even though the string was marked safe above."And then "SafeMarkup::escape() will not escape the markup tag since the string was marked safe above."
Comment #99
alexpottThanks @xjm - we're !.
re #97
strip_tags()
. That said excellent point about having to maintain two lists of default tags for both Xss::filter and SafeMarkup::xssFilter. Patch attached does things in a slightly different way so we don't do this.re #98
Comment #102
Fabianx CreditAttribution: Fabianx as a volunteer and at Tag1 Consulting commentedAll changes look good, back to RTBC!
Comment #103
David_Rothstein CreditAttribution: David_Rothstein as a volunteer commented"#markup is filtered for admin XSS automatically"
code comment a huge number of times...It's not sustainable that people will remember to add this every time #markup is used (in fact, I pointed out in #84 that even this patch doesn't add it consistently) so why use it in so many different random places?
I think if you're going to make the change this patch proposes you need to own it - the code should speak for itself and a pattern that is used this often shouldn't need a code comment.
Unless someone can point out that this is a pattern Drupal 8 is already using, can we defer that to #2527360: Review all usages of Xss::filter(), Xss::filterAdmin(), and Html::escape() instead? It's not even related to this issue.
Put another way, the patch has an entire paragraph of code comments explaining what you should do as an alternative if you find yourself wanting to call this nonexistent function:
That's a pretty good sign that the function should just be there :)
And the patch is nostalgic for it too - still references it in at least one place:
Comment #104
xjmI disagree with #103 points 1 and 2. The full intent of the original Twig autoescape issue in #1825952: Turn on twig autoescape by default was that we would switch from not escaping by default to escaping by default -- not as a "last-ditch fallback" but as our primary sanitization functionality. See the original issue summary from @catch on that issue:
(For the last paragraph, we have #2528284: Document that alternate Drupal 8 theme engines must implement auto-escape or they are not secure which somehow never got filed as a followup for #1825952: Turn on twig autoescape by default even though it's discussed numerous times on that issue.)
All of the places in core that still escaping variables early, all the places that escape twice or inappropriately, all the explicit
SafeMarkup::set()
use -- these are all technical debt from the original change and are certainly not introduced by this issue. We discussed this in Amsterdam, and agreed that we needed to remove earlycheckPlain()
, early rendering, etc., in order to get the full advantage of using Twig, reduce the performance and memory regressions, and set consistent expectations about the behavior of variables that might contain user input in Drupal 8.The expectation that
#markup
will be XSS-filtered is also not introduced by this issue. It was added in #2273925: Ensure #markup is XSS escaped in Renderer::doRender() which was committed about 5 weeks ago but had been open for more than a year. It's also already been added in several other places in core and is one of the strategies we've been using to consolidate string filtering and sanitization strategies for the main release-blocking meta in #2280965: [meta] Remove every SafeMarkup::set() call. If I grepped correctly, there are 361 non-test uses of#markup
in core.Using render arrays to defer the XSS filtering has many of the same advantages of using Twig's autoescape to defer escaping, and allows us to provide a more consistent and predicable API and stop doing so much processing twice. It is a pattern we should use when a render array is available. This patch helps clarify that by removing a circular dependency and documenting the existing side effects of using SafeMarkup.
As for the inline comments, I don't see why it's a big deal. I found them helpful reviewing the patch and I think it's valuable to help developers get used to the way that string filtering and sanitization is in Drupal 8. I certainly don't think they're required for every string, but in some places it's helpful to clarify or set a specific expectation for what sanitization is expected. If they really bother people we can remove them; I just don't think that's necessary. I certainly don't think having informative inline comments hurts.
For point 3, it just sounds to me like we should clean up the existing docs. I do think it's better to have fewer methods as this makes the API easier to understand.
I haven't yet reviewed the most recent changes, but from my perspective this issue is both ready and important to complete soon since it helps disentangle the problems that have been plaguing #2280965: [meta] Remove every SafeMarkup::set() call.
Comment #105
catchCompletely agreed with @xjm's #104. We've not got anywhere near as far as I've hoped removing redundant code in preprocess functions, but this patch does get us closer and unblocks a lot of the other work. The performance/memory implications of SafeMarkup are not trivial - it is adding 3mb to each form cache entry on Berdir's site - and this is because it's overused all over the place in core. This patch is a major step towards fixing that.
Couple of things in the patch, otherwise agreed it looks ready.
As David points out - should either add the method or remove the reference to it.
Are we sure we're removing this before 8.0.0 - or should this be deprecated until 9.0.0? I don't mind, just asking.
Comment #106
David_Rothstein CreditAttribution: David_Rothstein as a volunteer commentedI thought the original performance rationale for #1825952: Turn on twig autoescape by default turned out to not be too important in the end? (Or at least it's hard to see how - if I run the "Kitchen Sink" text from http://html-ipsum.com/ through Xss::filterAdmin() it takes less than 1 millisecond even though it's a fair amount of text, and it takes a fraction of that for SafeMarkup::checkPlain().) You'd need to be putting a lot of text in unused template variables or building up a lot of unused render arrays for it to matter. The memory usage may be a different story, although of course that wasn't part of the original plan since the problem was only introduced by SafeMarkup itself...
I'm more interested in what core is actually doing now - but looking through some of the 361 uses of #markup I'm willing to concede that it's probably relying on this in some places already :) A little hard to be sure since it involves searching for a negative, but OK. So I agree: We can keep that as is in the patch, then defer further discussion to #2527360: Review all usages of Xss::filter(), Xss::filterAdmin(), and Html::escape().
I do think the repetitive code comments should be removed (#103.1), but don't feel that strongly about it.
I definitely think we should add back SafeMarkup::xssFilterAdmin() though (#103.3) - it's important to have consistent APIs. I will look into that.
Comment #107
David_Rothstein CreditAttribution: David_Rothstein as a volunteer commentedSomething like the attached.
A couple notes:
I removed the "just allow the Twig to autoescape the value" because it didn't make sense to me (at least not without further explanation). If you're planning to call checkAdminXss() it means you have HTML, so Twig autoescape isn't a valid alternative.
I left it alone since it's marked deprecated anyway, but phrased this a little differently in the documentation for xssFilterAdmin() - see what you think.
Comment #108
alexpottRe: #105
re #106
SafeMarkup::xssFilterAdmin()
is worth it in #2527360: Review all usages of Xss::filter(), Xss::filterAdmin(), and Html::escape() - atm it has 0 use cases. The title use-cases are incorrect as the admin filter allows tags that are not in H1 see #2530474: Discuss whether | which html tags to allow in entity labels. Adding a method is a lot simpler that taking away.#markup
and know it is admin filtered by defaultThe diff is to #99 as I was preparing this patch before #107 was posted - I've included the changes to the deprecation notice as it make sense to me.
[Edit had to delete previous comment - confusing number of files uploaded]
Comment #110
David_Rothstein CreditAttribution: David_Rothstein as a volunteer commentedI guess it's hard to argue with that :) My point is that by not adding it, this issue is introducing an inconsistency in the API. And I think there are definitely legit uses for xssFilterAdmin() even if none are in core...
If we move this to a followup issue let's at least create a dedicated issue for it. We already have a patch (via the above interdiff) and #2527360: Review all usages of Xss::filter(), Xss::filterAdmin(), and Html::escape() is a lot broader and not really the same thing.
You must be secretly conflicted though - the patch still has a reference to "SafeMarkup::xssFilterAdmin()" :) Otherwise looks good.
Comment #111
alexpottIndeed :) it was a c&p error from the update to the deprecation notice in #107... I should have spotted that - thanks.
@David_Rothstein - given that I think we have an agreed way forward for everything any chance of an rtbc to unblock the postponed issues?
Comment #112
David_Rothstein CreditAttribution: David_Rothstein as a volunteer commentedComment #113
catchCommitted/pushed to 8.0.x, thanks!
Comment #115
xjmI went to commit this but it looks like it's in. :)
Comment #116
David_Rothstein CreditAttribution: David_Rothstein as a volunteer commentedComment #117
xjmI published the CR.
Comment #118
mbovan CreditAttribution: mbovan at MD Systems GmbH commentedComment nitpick, but won't change status btw...
Comment #119
alexpott@mbovan can you create an new issue to fix that? Thanks.
Comment #120
mbovan CreditAttribution: mbovan at MD Systems GmbH commentedYes... #2532082: Replace SafeMarkup::filterXss() with SafeMarkup::xssFilter() in method documentation
Comment #121
miro_dietikerFound a followup: #2532284: HTML is escaped in "all languages" in UI after SafeMarkup change
Comment #123
mpdonadioA minor followup is needed as the comment no longer matches the code below it (ie, SafeMarkup::set() is gone). Will try to create/patch when I am finished with the current issue I am working on.
Comment #124
mpdonadioFollowup w/ comment fix is #2556895: Fix comment in Xss::filter().