Problem/Motivation
In Drupal 8 we enabled twig autoescape but we also created SafeMarkup::checkPlain() to escape text and then mark it as safe to ensure that Twig does not double escape it.
Proposed resolution
We should rely on Twig auto escape and completely remove SafeMarkup::checkPlain()
. SafeMarkup::checkPlain()
is dangerous because if the resulting string is modified in any way, for example nl2br()
, the result will no longer be marked safe. In places where explicit escaping is needed, Html::escape() should be used.
A common pattern is to use:
[
#markup => $value,
'#safe_strategy' => Markup::SAFE_STRATEGY_ESCAPE
]
To allow the render system to do the escaping for us.
Remaining tasks
- Agree that this is desired
- Manual testing
- Review
- Commit
Manual testing steps
(From #40)
A few places to check (because checking the whole patch would be crazy):
- Twig Debug Comments (must turn on twig debugging in your services.yml)
- Site name in Bartik (in header)
- Views token replacement
- A view_mode with an & in the views UI entity view mode choices
- Views exposed form options.
- User names with special characters.
- User Permissions Page
- Quick Edit tool
- Language Content Settings table
- Config Translation Overview
- Image Style edit form
- Comment Author name & title
And anywhere else in your travels, feel free to use simplytest.me for this task. And ask me if you need clarification on those pointer areas. Try to including tags and & in input to see what you get. and looking for where it's encoded in source as & which will show up on the screen as &
User interface changes
None
API changes
Use Html::escape
where explicit escaping is still required.
Data model changes
None
Beta phase evaluation
Issue category | Bug because we have double escaping in HEAD because of incorrect SafeMarkup::checkPlain() usage |
---|---|
Issue priority | Major because SafeMarkup::checkPlain() is hard to use correctly and we should be using auto escape |
Disruption | Not disruptive for contrib or custom because deprecated and removal will be dealt with later see #2555473: Deprecate SafeMarkup::checkPlain() for Drupal 8.0.x |
Comment | File | Size | Author |
---|---|---|---|
#183 | 2545972-2-183.patch | 79.86 KB | alexpott |
#183 | 182-183-interdiff.txt | 2.83 KB | alexpott |
#182 | interdiff.txt | 3.11 KB | Wim Leers |
#182 | 2545972-2-180.patch | 82.51 KB | Wim Leers |
#177 | 2545972-2-177.patch | 81.83 KB | alexpott |
Comments
Comment #2
alexpottComment #3
catchLooks right to me, agreed we need this - function body is identical to check_plain() in Drupal 7, and it's worth having the wrapper in case we ever needed to adjust those arguments again.
Comment #4
alexpottLet's convert one particularly egregious usages of
SafeMarkup::checkPlain()
in this patch.UrlHelper::filterBadProtocol()
really ought not to be setting safe strings - it's only used inXss::attributes()
- which should not be marking anything as safe.UrlHelper::filterBadProtocol()
has existing test coverage inDrupal\Tests\Component\Utility\UrlHelperTest
Comment #5
alexpottI think due to #4 this should now be categorised as a bug.
Comment #6
star-szrI agree we should have this. Patch looks solid and even has test coverage, thanks!
Comment #7
Wim LeersSymmetry! :)
Nit: Could've used
[]
instead ofarray()
.Comment #8
xjmHm, I really feel like this is bloat. We are wrapping a PHP function. Every single method we add to the string sanitization API makes it harder to understand and people are already overwhelmed by it. See all the discussion in #2506195: Remove SafeMarkup::set() from Xss::filter().
The reason
checkPlain()
exists in the first place was a unicode support bug prior to PHP 5.2.x, no? D7 includingcheck_plain()
was basically just BC. In D8 we've retained it even though we renamed it... and now we're going to add yet another copy of it, so the caller has to choose between two different wrappers for a PHP function?Comment #9
xjmAnd yes, to @alexpott's point, I also think decodeEntities() is probably bloat. But at least it's not bloat in what is already an overwhelming space.
Comment #10
xjmI also disagree with the assertion that this is a bug, because the bad code in head should just use
htmlspecialchars()
instead.Comment #11
alexpottFor me the question comes how much to do we want to see htmlspecialchars() all over the place and have to remember the args. And happens when one of those does not match the expectations set by html::decodeEntities() or the SafeMarkup escaping.
Filed #2546232: Remove SafeMarkup::checkPlain() from UrlHelper to address the bug put in this with the current recommended solution and setting back to a task and assigning to @catch as this was his suggestion.
Comment #12
pwolanin CreditAttribution: pwolanin at Acquia commentedIt looks as though we could (maybe *should*) call
ini_set("default_charset", 'UTF-8')
in DrupalKernel. Then we could omit the 3rd argument in usages since it defaults to UTF-8 in PHP 5.5 and to the default charset in 5.6+.The only risk would be someone putting the wrong second argument (or no argument) in a case like HTML attribute values where we rely on all quotes being escaped.
Comment #13
alexpottThere was discussion about the use of
htmlspecialchars()
on #2505701: Document SafeMarkup::set and Use htmlspecialchars() directly in Attribute() so we don't bloat the list of safe strings see comments 11 and 12.Was hmm okay at the time but now looking what's likely to be thrown up by #2527360: Review all usages of Xss::filter(), Xss::filterAdmin(), and Html::escape() I think a method makes sense. Especially as SafeMarkup::checkPlain() actually makes no sense outside of SafeMarkup::format().
Comment #14
catchIf we have Html::encodeEntities() that provides a place to document where it's used. Calls to htmlspecialchars() with two optional arguments don't provide anywhere for this clarification to happen.
Yes it comes down to this, and also whether those arguments to htmlspecialchars() might ever have to change in the future - as #12 suggests they could (although at least no bc break for now).
Comment #15
alexpottOr we can do the proper thing and remove
SafeMarkup::checkPlain()
. The patch attached reveals numerous problems with our use ofSafeMarkup::checkPlain()
.Some examples of currently wrong usage:
$href = '<' . SafeMarkup::checkPlain($attributes['href']) . '>;';
$variables = array('%directory' => $directory, '!htaccess' => '<br />' . nl2br(SafeMarkup::checkPlain($htaccess_lines)));
$context['message'] = 'Updating ' . SafeMarkup::checkPlain($module) . ' module';
$import[] = '@import url("' . SafeMarkup::checkPlain(file_create_url($next_css_asset['data']) . '?' . $query_string) . '");';
'#suffix' => SafeMarkup::checkPlain($comment->label()) . '</li>'
'#markup' => '<span lang="en">' . SafeMarkup::checkPlain($source_array[0]) . '</span>',
$options[$vocabulary->id()][$term->id()] = str_repeat('-', $term->depth) . SafeMarkup::checkPlain($term->getName());
By removing
SafeMarkup::checkPlain()
we begin to fulfil the promise of Twig's auto escape feature AND we make double escaping far less likely.Comment #16
alexpottAnother example of how SafeMarkup::checkPlain() makes things difficult #2503963: XSS in Quick Edit: entity title is not safely encoded
Comment #17
alexpottThe old code here is basically asserting that everything is double escaped.
Comment #19
alexpottOne of the failing tests
Comment #21
joelpittetThis should knock off one more of those the Drupal\views\Tests\Plugin\DisplayFeedTest.
To note, this is currently broken currently broken in HEAD because if a field value has a new line character, the SafeMarkup'd string is now dirty when the
<br>
tags are added it will be passed to the link and won't match the SafeMarkup::safe_strings list, thus escaping it.I'm not sure my solution is the right one, but at least it will do the trick for the time being.
Really like the direction of this patch though, big fan!
Comment #23
joelpittetWell that sucks, fixed one test, broke 4 more:S @alexpott you set traps! I'm not so sure the search title's should be escaping those values? @see interdiff The UUID was just a casting problem.
Comment #25
dawehnerIt is odd to have Html::encodeEntities vs.
SafeMarkup::checkPlain
, so can we implement a aSafeMarkup::encodeEntities
calling to::checkPlain
or wise versa?Comment #26
alexpottSo this fix is interesting! Because this means the the original code...
Is broken in some instances because the nl2br will cause the string to lose safe-ness.
Also this fix will reopen the debate about SafeString::create() and whether it should be @internal. However SafeString::create() + Html::encodeEntities() is better than SafeMarkup::checkPlain() because it lacks any side effects. I will investigate if there is another fix.
Comment #27
alexpott@dawehner the latest patches are completely removing SafeMarkup::checkPlain()
Comment #28
dawehnerMeh, so I reviewed the other issues for fun, meh
Comment #29
alexpottSo a non safe-string solution it possible just means doing more rendering (at some point we probably need to discuss whether this is desirable but I guess render cache and all the other caching efforts saves us here).
Patch is rerolled on top some recent commits that have removing some SafeMarkup::checkPlain() calls.
The issue summary needs a major update. Also I think we should open another issue to remove SafeMarkup::escape() that is postponed on this one.
Comment #30
alexpottOh and patch in #29 improves the assertEscaped to both look for the escaped text (which it did) AND ensure the raw text is not there.
Comment #32
alexpottSo the improvement to assertEscaped was ill-advised I guess we can explore that in a followup as it caused fails in
Drupal\system\Tests\Menu\BreadcrumbTest
andDrupal\field\Tests\FormTest
and is not directly related to this issue - I only added it because of the improvements made toDrupal\views\Tests\SearchIntegrationTest
by this patch.Last test fail fixed too.
Comment #33
alexpottReclassifying as a bug since this patch resolves quite a bit of double escaping and removes a lot of unnecessary strings from the SafeMarkup safe list.
@joelpittet I didn't understand
From the issue summary so removed it.
Comment #34
alexpottComment #35
alexpottOpened #2549393: Remove SafeMarkup::escape() and rely more on Twig autoescaping and unassigned @catch since this is now ready for anyone to review.
Comment #36
alexpottStarted a CR https://www.drupal.org/node/2549395
Comment #37
joelpittetWhoops that was part of a comment that ended in the issue summary related to tracking down the views feed test failure in stringformatter.
Regarding more rendering, we'd like to avoid early rendering as much as possible, but in this case it may already be rendered... I'll have to keep an eye out in a review.
Comment #38
alexpott@joelpittet I didn't say "early rendering" - I said "more rendering" this patch does not add any early rendering.
Comment #39
joelpittet@alexpott sorry was on phone checking email, hadn't read the patch yet and read them as the same thing. Fewf
Comment #40
joelpittetI'll add a test tonight at the airport for the bug found in #21 which should be resolved now.
I'm adding a "Needs manual testing" because with this change there is a possibility that we double escaped a few things by accident.
A few places to check (because checking the whole patch would be crazy):
&
in the views UI entity view mode choicesAnd anywhere else in your travels, feel free to use simplytest.me for this task. And ask me if you need clarification on those pointer areas. Try to including
<tag>
tags and&
in input to see what you get. and looking for where it's encoded in source as&amp;
which will show up on the screen as&
Comment #41
dawehnerNote to myself, stopped before views.
+1 for not filtering/escaping here
Mh, don't we want to just escape on output time?
Do we really need to Html::encodeEntities what is passed into #markup? Just curious here
This is odd I mean do we really need that? Maybe we could add some small comment?
Why do we need to escape a known string?
So there are places in the patch which use #markup with Html::encodeEntities and without. I think we should make it clear, which one is the one to use
If I understand template_preprocess_table() correctly it will then be passed to
$variables[$section][$row_key]['cells'][$col_key]['content'] = $cell_content
which is a variable which will be escaped by twigSounds like a classical usecase where we should apply render side escaping ... This is used in
autocomplete.js:renderItem
So just as example I guess the entire resulting token string will be marked as safe then?
It is interesting that we filter first and then still encode the entities
Just went back in time and indeed this was never used
Nice, this is quite better.
Nice one!
So the result of comment_node_update_index() is safe because we use render API there?
This should be fine, as the result is rendered as Link using the LinkGenerator at the end
So AttributeValueBase/AttributeString escape key and value so we are safe here
Btw. it is really great that you no longer have to escape for #type select options. This is quite a DX win
Out of scope of this issue but I guess those instances should use
$sanitize ?
as well?Comment #42
tim.plunkettWow EntityListBuilder::getLabel() is now completely pointless :)
Comment #43
mbrett5062 CreditAttribution: mbrett5062 commentedChanged issue title from "Replace" to "Remove" as per #27
Comment #44
dawehnerMh, so these variables are used in JS, does that mean we need some protected there instead?
Comment #45
Wim LeersEscaping in JS would be consistent with #2503963: XSS in Quick Edit: entity title is not safely encoded.
Comment #46
alexpottSince this in a chain I'll postpone in on #2549791: Remove SafeMarkup::xssFilter() and provide ability to define #markup escaping strategy and what tags to filter. I basically wrote all these patches first to prove that it was actually possible to do. Now that that is proved I'll adjust the issue statuses accordingly.
Comment #47
catchI opened #2550945: Add Html::escape() to talk about adding Html::encodeEntities() or not. I think the wrapper is useful both in the interim state before this overall patch gets committed and after.
Comment #48
ericjenkins CreditAttribution: ericjenkins at Digital Bridge Solutions commentedI'm sprinting at the Midwest Developers Summit in Chicago right now, and I'm taking a look at #40.
Comment #49
ericjenkins CreditAttribution: ericjenkins at Digital Bridge Solutions commentedRe-rolled #32. Fixed conflicting lines in
core/themes/engines/twig/twig.engine
, preferring #32 lines over changes made elsewhere.Comment #50
ericjenkins CreditAttribution: ericjenkins at Digital Bridge Solutions commentedI'm no longer working on this ticket. Planning to pick it up tomorrow.
Comment #51
ericjenkins CreditAttribution: ericjenkins at Digital Bridge Solutions commentedPatch requires another re-roll. Addressing this now.
Comment #52
ericjenkins CreditAttribution: ericjenkins at Digital Bridge Solutions commentedFixed merge conflict in
core/modules/node/src/Plugin/Search/NodeSearch.php
, favoring Patch #32 line 338 over conflicting line. Re-rolled.Comment #53
ericjenkins CreditAttribution: ericjenkins at Digital Bridge Solutions commentedUpdating the Teaser label in the core.entity_view_mode.node.teaser.yml file to include an '&' character causes double escaping as a result of Patch #52. Picture:
Steps to reproduce
I'm done working on this issue for the day.
Comment #54
joelpittetHere's a re-roll and this is no longer postponed, but it does need the review @ericjenkins did in #53 taken into consideration and after that is resolved, more manual testing.
Which is cases where the escaped value gets passed into link generator.
Comment #55
joelpittetComment #57
Wim Leers#2549791: Remove SafeMarkup::xssFilter() and provide ability to define #markup escaping strategy and what tags to filter just landed, this is now fully unpostponed.
Comment #58
keopxComment #59
Wim LeersComment #60
alexpottHere's a massive reroll based on #32 and with all SafeMarkup::checkPlain() references in code removed. Also used the new
Html::escape()
instead ofHtml::encodeEntities()
that the original patch added. Still need to handle all the feedback in the issue. I'll start addressing that once this is green again.The interdiff is the result of the removal of all the checkPlain in comments and cleaning up the re-roll. Unfortunately it is not that useful.
Comment #61
alexpottNow let's use the new
#safe_strategy
.Comment #63
mpdonadioUpdated IS to match approach in #60.
Comment #64
dawehnerNote #41 is still true (mostly)
Comment #65
alexpottFixing the test fails... now to address all the feedback since finally the patch is up-to-date with head.
Comment #66
Wim Leers#63++
#60: interdiff review:
This docs update looks a bit strange. Why are we explicitly recommending
SafeMarkup::format()
here, while we were not before?And while that makes some sense, I don't think the mention of
Xss::filter()
still makes sense; I think that should point to#markup
's#safe_strategy
property?Why explicitly mention it here? We don't do that elsewhere? Specifically because this is the node title and that'll cause people to raise eyebrows when they read this?
Pre-existing technically, but reading
sounds scary. If it said or something like that, that'd be less scary :)Shouldn't this instead be recommending to use
['#markup' => 'title', '#allowed_tags' => […]]
?Shouldn't this also mention
#markup
+#allowed_tags
?s/the render system/the Markup element's "escape" safe strategy/
s/markup tag/tag/
s/marked/marked safe/
#61: interdiff review:
This one is wrong.
drupal_render_root()
already explicitly returns a safe string, and we specifically still want to escape it because we want to preview the source (markup) of the feed.Could use specific documentation explaining this probably.
This would also fix 3 of the 4 failures AFAICT.
Next: another complete review.
Comment #67
Wim Leers#65:
Hah, that's definitely an alternative solution to the problem I also found in my review of #61 in my previous comment. But I think it is much clearer to use
Html::escape()
, because that's more explicit.Comment #68
alexpott@Wim Leers the problem with using
Html::escape()
is that you will then XSS admin filter the for no reason.Comment #69
alexpottComment #70
Wim LeersChatted with Alex Pott about #67/#68. It doesn't make sense to use
Html::escape()
, because that doesn't mark the result as a safe string. Which means that#markup
will still attempt to do filtering. Which is excessive work. Hence, the approach that #65 uses already is the best option.I do think the comment can be made clearer though, IMO the intent is only clear from context and would be better spelled out explicitly. i.e.
->
(Or something like that.)
If you disagree that that's better, it's fine to leave it as is.
Comment #71
Wim LeersAnd finally, the full review of #65, as promised.
It's a pre-existing problem, but this will cause
Markup
's filtering safe strategy to still be applied. Even though that's absolutely unnecessary.The only tag that is going to be in there is
<br>
, by definition. And that is allowed by\Drupal\Component\Utility\Xss::getAdminTagList()
.However, using
SafeString::create()
here or creating anotherSafeStringInterface
implementation for this also feels kinda overkill.Not sure what's best.
But if we don't improve this here, I think we should at least document it, because in the future we won't be scrutinizing string safeness etc quite as much again.
Why?
$string
is already escaped.In #2503963: XSS in Quick Edit: entity title is not safely encoded, it was determined that this escaping actually belongs on the client side.
Unused uses:
SafeMarkup
&Xss
.LOL, yes, that was dead code for some time already. Yay for no more dragging it along.
Much nicer :)
Lots of side effects removed (needless additions to the safe markup list), yay!
Hm. These
#options
changes are in many places. Can you explain why this is okay? Where are they being escaped now?Unused.
Can't these be updated to have the
<span>
s in#prefix
&#suffix
?<3
This could be implemented more cleanly by doing:
SafeMarkup
is still imported.Unused.
Isn't it effectively testing the same thing? A title that still needs to be escaped (and is not yet marked safe) and a title that already is escaped (and already is marked safe)?
I don't see the difference.
<3
This already uses
assertEqual()
, so why is this change necessary?Unused.
Does this need additional test coverage?
Unused.
Comment #72
alexpottSo started at #41 - thanks @dawehner
#safe_strategy
<!--action-bulk-form-select-all-->
'#safe_strategy' => Markup::SAFE_STRATEGY_ESCAPE
now. The render element Markup contains documentation as to how it escapes or filters text.re #44 - thanks @dawehner again - I shouldn't have removed that. Adding back escaping. Escaping and JS should be a separate issue.
Still to address #53 - thanks for finding this @ericjenkins
#60 - thanks @Wim Leers
re #70 - thanks @Wim Leers (again!) - fixed this by using your better comment.
#71 - thanks @Wim Leers (again!)
I've added a comment. TranslationWrappers are not as simple as SafeStrings.
Comment #73
Wim LeersAwesome reroll, Alex, thanks!
// @todo: …
in the code. Also related to this: #44, #71.3.'#weight' => -1000
). Then, when we render, there's no more need for us to prefix the rendered output with<h1>TITLE</h1>
— the rendering already took care of that for us, and we just get a single safe string!Interdiff review:
s/translate/translate()/
s/tranlsation/translation/
:)
So, the only remaining bits:
Comment #74
alexpott#73
re #53 - fixed. The problem was that EntityRow was doing some unnecessary escaping - Twig'll do that for us! Added test coverage.
So still to do:
Comment #75
Wim LeersSo the only remaining thing is then #71.8.
Comment #76
Wim LeersFollow-up filed for #71.1: #2554755: BasicStringFormatter's nl2br() forces escaping *and* filtering — improve this.
Comment #78
alexpottFixed the tests...
So I was wrong about #71.8 - by default checkbox and radio values are admin filtered because we use #title to print them out. I think this is the wrong behaviour because it does not really meet the expectations set out with auto-escape. We have two options:
The patch attached goes for the latter since in discussion with @Wim Leers we think it makes the most sense. Let's see if anything breaks.
Comment #80
alexpottSeems I broke some tests too :)
Comment #81
Wim LeersI think this is ready.
We still need that manual testing though — see #40 for steps.
Comment #82
xjmComment #83
xjmSo FWIW, I'm tentatively in favor of making this change, despite the BC break it causes, because (a) it will be a clean break (so easy to find and fix) and (b) it has significant performance, security, DX, etc. implications.
However, I do think we need to have a discussion about the change with the committers before we commit this one, because it is so sweeping.
Also, has anyone looked over what other change records might need to be updated? It'd be good to have a plan for that as well. (We should search all the CRs for check_plain and checkPlain, etc.)
Comment #84
alexpottHere's the current CR situation:
https://www.drupal.org/node/2506757 - Xss::filter etc no longer marking things as safe - needs an update for #2549791: Remove SafeMarkup::xssFilter() and provide ability to define #markup escaping strategy and what tags to filterhttps://www.drupal.org/node/2528662 - Token cacheability - needs update to useHtml::escape()
Comment #85
dawehnerRight if we break BC we should break it hard!
Comment #86
alexpottDiscussed with catch and he suggested that we don't do the removal in this issue. It has enough to do without doing that. Also considering how widely used this is in contrib it would be good to have a release with the method deprecated and give time for everyone to convert their modules. Here's a patch which keeps
SafeMarkup::checkPlain()
and it's tests but removes all usages from core.This is also a reroll.
Comment #87
Wim LeersShouldn't this patch then mark it for deprecation? (Presumably in beta 16, so that during beta 15, developers can upgrade.)
Comment #89
alexpott@catch suggested doing this in another issue - so we can get this in a discuss the deprecation in another issue - this one #2555473: Deprecate SafeMarkup::checkPlain() for Drupal 8.0.x
Comment #90
alexpottOne of the new SafeMarkup::set() patches added a weird unit test.
Comment #91
Wim LeersPatch is green. Before RTBC'ing, we still needed that round of manual testing described in #40. None of the exact steps showed problems, but in the process, I did find two double escaping bugs, which are actually the same bug: the Frontpage view's title has a double-escaped site name, this shows up on the actual frontpage, but also in the preview.
However… this is a pre-existing bug in HEAD!
Hence: RTBC. (I think a patch needs to be RTBC for a framework & release manager to even look at the patch? I could be wrong. Feel free to un-RTBC if I'm wrong.)
Comment #92
Wim LeersAnd issue filed: #2555503: Site name from frontpage view is double-escaped on frontpage and in preview.
Comment #93
catchSo I suggested deferring the deprecation and removal to another issue, so that we could get the main patch in without a full discussion on contrib impact. That should mean we can remove both tags from this issue (apart from it needing to be reviewed by a committer), but assigning to xjm since she added the tags to confirm that was the only thing that needed a wider discussion.
Comment #94
joelpittetThe underlying problem with #53 seems to be if you filter or escape a known safe value and pass it in to the link generator, it will escape it again. The filter one is pre-existing in head because we removed the
SafeMarkup::set()
safe from it in a recent issue.I'm glad we removed the premature escaping, and in most cases may be the correct course of action but I feel that part will pop it's head up again.
So as far as I see, removing the premature escaping is great fix but only treats the symptoms. If the solution is from:
to
I think we should add that to the change record so people know how to deal with it? If that is not correct and I'm way off base, then the correct way to deal with those pre-escaped variables would be nice to cover in the CR regardless as it seem to cause me and others a bit of confusion.
Comment #95
xjm#93 makes sense to me too.
I found other CRs not mentioned in #84:
https://www.drupal.org/node/2073933
https://www.drupal.org/node/1895936
https://www.drupal.org/node/1876852
https://www.drupal.org/node/2457593
https://www.drupal.org/node/2372691
https://www.drupal.org/node/2153775
https://www.drupal.org/node/2083471
https://www.drupal.org/node/2067859
https://www.drupal.org/node/1876710
https://www.drupal.org/node/1762604
but those we could address in the issue that does the deprecation.
Comment #96
pwolanin CreditAttribution: pwolanin at Acquia commentedI'm a little confused by this flag
Can we define a different element like
#plain_text
instead of overloading#markup
?If not can we make the constant at least shorter? like
Markup::ESCAPE
Comment #97
joelpittetAs much as I love this issue, #94 needs to be addressed and if we can't I'd even propose reverting #2506195: Remove SafeMarkup::set() from Xss::filter() until it can be.
Edit: wrong referenced thing to revert
Comment #100
pwolanin CreditAttribution: pwolanin at Acquia commented@joelpittet
part of your question doesn't make sense. Since l() escapes the title, I think you'd never do this:
Rather you'd just do:
For the other you'd use a render array like:
However, per my comment in #96, I don't like overloading #markup for plain text also.
Comment #101
joelpittetMy cases are assuming you have markup you don't want to double escape and have previously sanitized.
It does make sense because the fact that l() will escape unsafe strings and since they aren't marked safe any more to prevent the escaping in l(), they will be double escaped now.
So passing a render array to the title is another approach that maps to xss admin filter if not safe. Will prevent double escaping but may do xss filter twice.
Comment #102
Damien Tournoud CreditAttribution: Damien Tournoud commentedI agree with @pwolanin.
'#safe_strategy' => Markup::SAFE_STRATEGY_ESCAPE
is just plain wrong, as it mixes two radically different issues: the issue of the format (plain text vs. html), and the issue of the sanitization.This looks fine to me:
Unfortunately, the
Markup
element is hardcoded inside\Drupal\Core\Render\Renderer
so it is the only one that can omit#type
right now.Comment #103
joelpittetThe examples are contrived to be succinct. Imagine those variables were filter/escaped into a variable and passed in.
Comment #104
alexpott#94 @joelpittet for Drupal::l() and wanting to pass markup use #markup, if you want it escaped just do what @pwolanin says in #100. If what you are passing to Drupal::l() has already been rendered then just pass it in - it will be a SafeString already. If it has already been escaped or filtered and its not already rendered you're doing it wrong.
As for the objection about overloading
#markup
with plain text. This issue does absolutely nothing to change that.. This issue converts[#markup => SafeMarkup::checkPlain($something)]
to use a later escaping strategy. I'm intrigued by the idea of introducing#plain_text
and removing the#safe_strategy
key. But there might be a key benefit of the current approach that is being overlooked. It makes it simpler to change the strategy whilst rendering. For example, at the moment node titles are escaped. However it is not uncommon to what to change this an permit a limited set of tags in the title. In HEAD atm the moment you could just change the strategy toMarkup::SAFE_STRATEGY_FILTER
and set#allowed_tags
to the desired list.@pwolanin I'm not too fussed about the constant names and would be happy to review an issue suggesting the change but one advantage with the longer name is that it tells you what it is for.
Comment #105
pwolanin CreditAttribution: pwolanin as a volunteer commentedPossibly we should just go ahead here, but I think the DX is a bit crummy and we should have a follow-up to improve it.
In terms of omitting
#type
I think it could be ok to let themarkup
type handle both#markup
and#plain_text
. Basically instead of passing a#safe_strategy
just use a different key name.Comment #106
alexpottCreated #2555931: Add #plain_text to escape text in render arrays - if we are going to do that then I'd like to get that one done before this.
Comment #107
joelpittetSorry for raising alarm bells, I am only trying to make sure people who run into double escaping after this change lands that they will know how to "not do it wrong". And that we add that to the change record so they can get support and pointers in the right direction.
So we are recommending people (obviously I know), let the
Drupal::l()
escape the text passed in and avoid pre-escaping unsafe markup withSafeMarkup::checkPlain()
or it's replacementHtml::escape()
if they were doing that before. Alternatively, pass in['#markup' => $string_with_markup]
if you want to filter out Xss but you are fine with or expecting un-escaped(no xss) tags and characters. If the value has previously runHtml::escape()
from somewhere else in the system, then you (a: might be doing it wrong? b: should pass it through['#markup' => $pre_escaped]?
. And we are not recommendingSafeString::create()
's usage for known pre-escaped strings?I have no opinion on the #plain_text comment because I don't understand it's use yet, though a follow-up seems like a likely candidate considering this patch's size. I'll follow the follow-up to try and get a grasp on it.
Comment #108
xjmI agree with #102 (regarding the problems of conflating escaping with filtering), #105 (that the DX for this whole space is definitely "crummy"), and #107 (that we need the change records and other documentation to communicate what people actually need to do. These problems are not introduced by this patch, though. The one change is that this patch does make a much wider use of the
#safe_strategy
introduced in #2549791: Remove SafeMarkup::xssFilter() and provide ability to define #markup escaping strategy and what tags to filter, which in turn was addressing limitations of the change made in #2273925: Ensure #markup is XSS escaped in Renderer::doRender(), which happened in response to the original SafeMarkup API needing to conflate escaped and filtered text in the first place, all of which put a lot more on the semantics of#markup
.So I think we definitely need to address those issues in followups. I do think the change record updates probably need to block this though. I'm planning to review the patch here (either as a reviewer or committer if it gets RTBCed again by someone else) and I'll see if I can help with that.
Comment #109
pwolanin CreditAttribution: pwolanin as a volunteer commentedFrom #106, maybe this should be marked postponed on #2555931: Add #plain_text to escape text in render arrays
Comment #110
alexpottHere an issue that replaced all usages of SafeMarkup::checkPlain() in a module with a relatively complex UI - https://www.drupal.org/node/2555045#new
Comment #111
xjmSo, this is interesting in light of the concerns above about URL escaping.
I thought "Huh, surprised that's not a method on the URL class now," and then t turns out
check_url()
is kind of redundant with UrlHelper::filterBadProtocol()... but for an entity decode that's in the method but not the function. This means (I think) that those two functions would have slightly different results on a URL that was passed in ashttp://www.example.com?a=1&b=2
. (@alexpott also pointed out that the case where thedecodeEntities()
would matter isn't actually in the unit tests.) Regardless, I think thatcheck_url()
should be deprecated in a followup to just become a wrapper forfilterBadProtocol()
.There are places that
check_url()
is used with@
tokens fort()
in HEAD that could result in double-escaping following the removal from the safe list ofcheckPlain()
results for the URLs. @alexpott is updating the patch for this. @todo file followup.Comment #112
alexpottNice catch wrt to
check_url()
. I think we need to deprecate this function too. It can be replaced byUrlHelper::stripDangerousProtocols()
andUrlHelper::filterBadProtocol()
.UrlHelper::stripDangerousProtocols()
should be use when the result can be escaped by SafeMarkup::format() or Twig.UrlHelper::filterBadProtocol()
should be used when you are sanitizing for other reasons - for example tokens.Setting back to needs review since I now think this can go in parallel to #2555931: Add #plain_text to escape text in render arrays - that issue is likely to go in first but if this one does get in first then that one just gets larger.
Comment #113
xjm(Just taking notes during review.)
This is entertaining. The lines for
$htaccess_lines
are statically defined in code in FileStorage::htaccessLines(). The purpose of the escaping here is not to sanitize user input, but actually to encode the angle brackets in the example.htaccess
for display to the user inside a code tag. I think this may be a double-escaping bug in HEAD regardless of whether a!
or@
placeholder is used (which would be a regression introduced by #2399037: String::format() marks a resulting string as safe even when passed an unsafe passthrough argument). Related: #2506445: Replace !placeholder with @placeholder in t() and format_string() for non-URLs in tests (which in its current form also leaves this bug there). This one is out of the scope of the patch and the patch does not affect it.Comment #115
xjmSo, bit confusing, but turns out the topic documentation for the Batch API is in
form.inc
. Cool...This updated documentation is a bit unclear to me and I had to read it several times and discuss with @alexpott to understand. The change is that previously, passing unsanitized text could result in XSS, whereas now, passing any HTML in the messages will result in escaping unless the strings are in the safe list...
...Or so the updated comment claims, but actually, I think that at least parts of it will not get escaped by Twig. See in _batch_process_page(). The
'init_message'
and'error_message'
get added in JS so it's not clear to me they would actually be escaped by Twig. Related: #2503963: XSS in Quick Edit: entity title is not safely encodedIt's not uniform either; there's a fallback case where
'error_message'
does get passed to renderBarePage() in a#markup
which eventually goes throughrenderRoot()
and therefore gets XSS-filtered in that case but not obviously in the other.@alexpott also found the use of the
'message'
key in_batch_do()
and it is similarly put into a JSON response, so therefore has the same escaping concern.Per @alexpott the
'results'
key in the$context
array is used by the specific implementation, so the comment might hold true for that one. Regardless, at least part of this docs change is incorrect and could result in XSS vulnerabilities in batch implementations if they try to follow this documentation.Also, the updated documentation is a bit confusing and switches between passive voice and second person, but we can worry about that once we make it say the right thing.
Twig will auto-escape these variables, except when it won't. It's definitely a lie about the second line now.
OTOH, as @alexpott pointed out, the first example line in HEAD is a
autodouble-escape bug waiting to happen.Comment #116
xjmAnother example in HEAD of where we protect site builders from XSS exploits introduced by module authors by embedding XSS (or is it SSS, "same-site scripting"?) in their module names. :P But if this were actual user input, I think this is a case where an XSS bug would be introduced.
@todo for self find the code change relevant for this docs change.
NIH, but this hunk of documentation is already a little misleading in HEAD.
Xss::filterAdmin()
has way more usecases than non-input-filter-filtered fields.Comment #117
xjmNote to self: @alexpott says that the support for render arrays here was introduced by #2549791: Remove SafeMarkup::xssFilter() and provide ability to define #markup escaping strategy and what tags to filter and there is discussion on that issue. The docs for the interface just didn't get updated. (
title_resolver
service used for_title_callback
on routes etc.) Edit: the discussion apparently actually was in IRC; on that issue there is just comment #22.Comment #118
xjmIt might be worth testing manually to ensure no double-escaping bug is introduced here.
Comment #119
xjmI confirmed this protected method is only used in core by the base implementation's
render()
method and that only the block implementation overrides it, and that in both cases, the label will get escaped anyway (so thecheckPlain()
is redundant). However, it is possible that a contrib or custom list builder implementation that used the label in a different way could have an XSS vulnerability following this patch, because this patch does change the documented behavior of the method. It is a correct change (see discussion on #2503963: XSS in Quick Edit: entity title is not safely encoded), but it might merit its own change record for this reason.This is similarly changing the documented behavior of SelectionInterface::getReferenceableEntities(), which says (emphasis added):
This is also a public method and, being Entity Reference and therefore JavaScripty, something we should check very carefully for potential XSS.
Comment #120
xjmhttps://twitter.com/xjmdrupal/status/313805301931446273
Comment #121
xjmOne suggestion I had for issue management here was to split off all the things that are straight conversions of
checkPlain()
toHtml::escape()
because there is no risk of introducing XSS vulnerabilities whatsoever, and the only thing they need to be checked for is the introduction ofdouble-(edit:single- or double-) escaping bugs that lack test coverage. That could land first and make the rest more manageable.Comment #122
joelpittetDivide and conquer! +1 to #121
Comment #123
alexpottCreated #2557467: Deprecate check_url and convert usages to existing UrlHelper methods and #2557519: Remove many usages SafeMarkup::checkPlain() and replace with Html::escape(). The latter would make this much much easier to review.
Comment #124
alexpottAdded #2557577: Use comma separated item list in contact listing and remove a SafeMarkup::checkPlain() - one less.
Comment #125
Wim Leers#2557519: Remove many usages SafeMarkup::checkPlain() and replace with Html::escape() landed. So, soon this patch will be much easier to review :)
Comment #126
alexpottHere's a reroll. I've come round to think that #2555931: Add #plain_text to escape text in render arrays needs to go in first - or at least the part of it that makes the escaping strategy always apply. This is because this is how SafeMarkup::checkPlain() works - it does not check if something is safe before escaping.
This patch also includes fixes for #115, #116.1, #118 (with a test).
Next, I think we could look at removing SafeMarkup::checkPlain from template preprocess functions because there it is quite clear that Twig will auto-escape.
Comment #127
alexpott183 usages to go! :)
Comment #128
alexpottCreated #2557871: Remove all usages SafeMarkup::checkPlain() from template preprocess functions and attributes
Comment #130
alexpottRerolled now that #2557871: Remove all usages SafeMarkup::checkPlain() from template preprocess functions and attributes is in.
Comment #132
alexpottThe next two issues to get in are:
#2555931: Add #plain_text to escape text in render arrays
#2557467: Deprecate check_url and convert usages to existing UrlHelper methods
Comment #133
lauriiiI didn't see any discussion about this in the issue so how are we going to support the PHP templating engine if we start trusting on the Twig autoescape?
Comment #134
star-szr@lauriii I think the short answer is that ship sailed long ago.
Comment #135
stefan.r CreditAttribution: stefan.r commentedThis is blocking a critical: #2549393: Remove SafeMarkup::escape() and rely more on Twig autoescaping
Comment #136
alexpott@stefan.r no it's not - it was when this issue added Html::escape() but it no longer does that.
Comment #137
star-szrHere's a reroll after #2555931: Add #plain_text to escape text in render arrays.
Comment #139
star-szrAfter chatting with @alexpott here's a new issue and patch up for review, nice and straightforward: #2559605: Convert SafeMarkup::checkPlain() in render arrays to #plain_text
Comment #140
stefan.r CreditAttribution: stefan.r commentedGiven that last patch will fatal error now that we don't use the Markup class anymore, here's a combined patch with #2559605: Convert SafeMarkup::checkPlain() in render arrays to #plain_text that does a simple search and replace on all other instances of #markup + Markup::SAFE_STRATEGY_ESCAPE with #plain_text. Which we can't just do as some things will be double escaped, so this will give a few test failures.
Maybe let's keep future patches in this issue synchronized with #2559605: Convert SafeMarkup::checkPlain() in render arrays to #plain_text and do a reverse patch as soon as it goes in?
Comment #141
stefan.r CreditAttribution: stefan.r commentedComment #143
alexpottCreated #2560055: Remove all usages SafeMarkup::checkPlain() in DiffFormatter and SafeMarkup from the Diff component to separate out the config diff work.
Comment #144
stefan.r CreditAttribution: stefan.r commentedreverting those
Comment #145
alexpottWe've got at least 4 issues to go in before we need to return to this one...
Let's get all of those in before returning here to see what is left...
Comment #146
stefan.r CreditAttribution: stefan.r commentedComment #147
stefan.r CreditAttribution: stefan.r commentedComment #149
stefan.r CreditAttribution: stefan.r commentedper #145
Comment #150
stefan.r CreditAttribution: stefan.r commentedJust blocked on #2549393: Remove SafeMarkup::escape() and rely more on Twig autoescaping anymore, the other 3 issues have been commited now.
Comment #151
alexpottThis is where we're currently at...
I think the next two things to split up are the render array changes and the radio/checkbox auto-escaping.
Comment #153
alexpottCreated #2560751: Remove all usages SafeMarkup::checkPlain() from Views titleQuery() and #2560641: Remove all usages SafeMarkup::checkPlain() from render arrays... just need to create one for the radio/checkbox auto-escaping. And then I'll postpone again.
Comment #154
alexpott@Xano's work on #2560641: Remove all usages SafeMarkup::checkPlain() from render arrays led me to create #2560851: Deprecate EntityListBuilder::getLabel() and remove usages of SafeMarkup::checkPlain()
Comment #155
alexpottCreated #2560863: #options for radios and checkboxes uses SafeMarkup::checkPlain() to escape - use Html::escape() instead
Comment #156
lauriiiI have updated all the CRs mentioned in #94 except https://www.drupal.org/node/2153775 and https://www.drupal.org/node/1762604 where I wasn't sure what should be done.
Comment #157
mpdonadio#156 thanks for updating the CRs. I'm not totally sure about some of them, though. In [#2372691], the change was essentially
from
to
In some of these, given the newness of
Html::escape
do we maybe want to keep both example of what not to use? In other words, people reading the CR may seeHtml::escape
and not think the same thing applies to an equivalent usage of String::checkPlain? Especially since we are just deprecating SafeMarkup::checkPlain and not removing it altogether.I updated that CR as an example.
Comment #158
xjmI don't understand why we would keep any references tocheckPlain()
in any change record. If someone doesn't know what Html::escape() is, they will look it up.Oh, I read https://www.drupal.org/node/2372691 and I think I get it. If the change record is specifically about updating D7 code, then yes, having the D7 "before" snippet makes sense.
Comment #159
joelpittetJust regarding the issue summary about nl2br, l did a fix here that's pretty handy for that case.
#2554755: BasicStringFormatter's nl2br() forces escaping *and* filtering — improve this
Comment #160
alexpottProgress report in patch form. 48K not bad. Added a few
@todo test
comments where I think we should be adding a test. I guess each could be split into a separate issue :(Comment #162
alexpottoops
Comment #164
alexpottComment #165
alexpottOops
Comment #167
dawehnerAdded: #2568965: Write a test view with extensive XSS test of all the things in views_ui
I'm actually confused why we care about that. this again is JS, well we certainly don't need SafeMarkup here
lol
Comment #171
alexpottFix the test.
Comment #174
alexpottResolving a lot of the @todos in the patch by adding tests. What's left over is some views plugins which don't have tests and usernames which are special because I need to create a test module to add markup to the username. The funky thing here is what is supposed to happen? If a module is permitting markup in a username display then I guess it will be marking it as safe too since otherwise what is the point. By default no markup is allowed in a username.
Comment #175
alexpottAdded test coverage for the rest of the @todo's in the patch with two exceptions:
Also the changes to not use SafeMarkup::format in the AssertContentTrait are totally related without them the safeness of a string was getting in the way of seeing the proper message and that is just wrong.
Comment #176
dawehnerThis is honestly super confusing. the JS should do the escaping, otherwise people will escape either both or none. Nobody will remember that you need to escape just one of them.
Won't that break people who try to group by most 404 pages?
Do we really have to take care of #title ? I would have expected it to be autoescaped in page.html.twig
Too bad we don't escape in JS
We can remove SafeMarkup here as well
Yeah this is BS, ... I mean its implicit tested here. What about dropping that line of documentation?
unused
Yeah for twig!
Are you sure you don't introduce an XSS here? We have a list of checkboxes in InOperator.php, that leverages
$this->valueOptions
, so don't we need escaping?So we really support a starting slash in URLs in tests? I thought this would not be the case.
We can remove SafeMarkup here
OH, its super tricky to not accidentally think its not needed.
Why do we not always uyse item_list, once with class comment and once without it?
Comment #177
alexpottIf this will remain contentious how about we punt this hunk to the follow that will actually remove
SafeMarkup::checkPlain()
?admin/reports/page-not-found
works just fine. And syslog does a strtr so I think this is fine.Comment #182
Wim LeersNit: s/the of the/of the/
Parse error: "is being theme using"
Oh, I think I see it: s/theme/themed/
Missing blank line.
Extraneous blank line.
Fixed these nits.
This caused a different sorting, and therefore a test failure. Fixed this too.
Is there anything left that is blocking RTBC?
Comment #183
alexpottFor me the only thing that blocks RTBC is
This hunk is tricky. The dissonance between late escaping in Twig and early escaping in JavaScript is shoved right in the users face here. Given that I think best thing to will be to discuss this in the issue that actually removes
SafeMarkup::checkPlain()
since that is where this will be completely real.Also as this is not removing the method the change record work can occur with the next issue.
Comment #184
alexpottComment #185
dawehnerSo what is the problem in actually escaping in javascript? BatchAPI is a limited subset of the JS, which we could easily patch, can't we?
Comment #186
alexpottre #185 let's delay discussion of the batch API to another issue and proceed here. For example #2569699: Remove SafeMarkup::checkPlain() for Drupal 9.0.x.
Comment #187
dawehnerAgreed
Comment #188
Wim Leers+1
Comment #189
catchI had to ask @alexpott about this one.
Currently we escape on the way in - not great for syslog.
By making it a placeholder, we escape on render in dblog - which is fine.
If we didn't escape at all, it would get XSS filtered but it should be HTML escaped anyway.
We should add test coverage for this, although for me I'm fine with that in a follow-up.
'appropriate scrubbing functions' is.... but also #2567257: hook_tokens() $sanitize option incompatible with Html sanitisation requirements is open to fix the tokens confusion.
Slightly perturbed this is needed for #title but I think that was already said by dawehner earlier.
This gets us a long way forward, anything remaining we can do on the removal issue as discussed above. Committed/pushed to 8.0.x, thanks!
Comment #191
dawehner