Problem/Motivation
Follow-up to #1825952: Turn on twig autoescape by default
If drupal_render()
produces #markup
it is always safe. If, however, the caller sets #markup
it is presumed to be safe to make #1825952: Turn on twig autoescape by default reviewable.
Proposed resolution
- Run all #markup through SafeMarkup::checkAdminXss().
- Add SafeMarkup::replace() - identical to str_replace() - but if all elements are safe, mark the resultant string safe too.
Remaining tasks
None
User interface changes
None
API changes
- All #markup goes via SafeMarkup::checkAdminXss()
- New SafeMarkup::replace()
Comment | File | Size | Author |
---|---|---|---|
#320 | interdiff-dorender-final.txt | 1.97 KB | xjm |
#320 | dorender-2273925-320.patch | 20.84 KB | xjm |
#315 | interdiff.txt | 1.82 KB | effulgentsia |
#315 | 2273925-render-xss.315.patch | 20.66 KB | effulgentsia |
#312 | 2273925-render-xss.311.patch | 20.3 KB | larowlan |
Comments
Comment #1
chx CreditAttribution: chx commentedComment #2
tim.plunkettNot retitling the issue just yet. But I think that #safe_markup would be good to have, and a much more friendly API for those writing render arrays than having to instantiate a class.
And then internally, we could have:
Comment #3
chx CreditAttribution: chx commentedCritical, non beta blocker per catch.
Comment #4
seantwalshComment #5
xjmComment #6
star-szrCan we make the API addition to add #safe_markup and then convert a handful of them and then have a follow-up to convert the rest?
Another question is can we blindly replace #markup with #safe_markup, I suspect not.
Comment #7
dawehnerMuch like each
SafeMarkup::set()
we should take care and think about each particular usecase. Most of them should alwayws be replaceable with some proper templatesand we should in case this is possible.
Comment #8
effulgentsia CreditAttribution: effulgentsia commentedWhy? The semantics of
check_plain()
is that it expects plain text. The semantics ofmarkup
is that it's HTML. Would it make sense instead to runXss::filterAdmin()
on #markup? And for cases where the caller sets #markup to something containing user input, then change those usages to use#type => 'inline_template'
instead? If we did those two things, would that remove the need for a#safe_markup
property?Comment #9
aneek CreditAttribution: aneek commented@Cottser, I think that if issue #2324371: Fix common HTML escaped render #key values due to Twig autoescape gets into core then we can have a solution to this. As per my thinking
drupal_render()
sets the$elements['#markup']
and this #markup property will send HTML all the time so we can use a methodSafeMarkup::checkAdminXss()
instead ofSafeMarkup::set()
.For further reference please follow patch #142 in #2324371: Fix common HTML escaped render #key values due to Twig autoescape
Comment #10
xjmComment #11
aneek CreditAttribution: aneek commentedPosting a patch by only changing the
SafeMarkup::set($elements['#markup'])
withSafeMarkup::checkAdminXss()
check. To me, adding#safe_markup
needs more efforts and can't be done so easily.Let's go for a review.
Comment #12
aneek CreditAttribution: aneek commentedComment #14
dawehnerAnything in that area seriously needs tests, in order to be sure, what is going on.
Comment #15
aneek CreditAttribution: aneek commentedPerhaps, the generated HTML are stripping some tags that were generated via #markup. This may happen as I've used
SafeMarkup::checkAdminXss()
method. While viewing the test result,Drupal\system\Tests\Common\RenderTest->testDrupalRenderChildElementRenderCachePlaceholder()
shows false result in matching HTML generated asOther issues may exists also. Have to check thoroughly.
Comment #16
joelpittetI know I'm the original OP but wrote this follow-up because we couldn't deal with it in the "Turn on Twig autoescape" issue's scope, but I'm not convinced this 'safe_markup' key is the right direction and it's just another #key to know how and when to use.
'markup' means it will contain markup, no constraints. I know from a security perspective this is horrible and from maintenance perspective it's a pain because the markup is written in PHP strings and is unstructured/flat representation. (even in D7).
We've been using #type => 'inline_template' so far to fix some of the SafeMarkup::set() issues, and though I don't like the DX there, I would love to see #markup become that, allowing the #context/#variables to exist and be escaped via the Twig engine. Just an idea I wanted to float by y'all:
The DX seems simpler than inline_template(though it would map to the same), the data is escaped, and the raw data is still available. (Chose '#variables' instead of '#context' but could be swayed either way)
This proposal doesn't fix the security and may likely has performance issues. BUT if we want to still allow for '#markup', I don't think we can reasonably manage it's security issues it produces.
Comment #17
aneek CreditAttribution: aneek commented@joelpittet, just to be on the same page, you are saying that instead of just passing HTML to
#markup
it should be sent via the inline_template system. So in that case the existing markups and their calls needs to be changed. As an example,will become,
Or may be (another thought),
In either case, we need to re-structure the
public function render()
method and calls to#markup
. Will this be a feasible solution to do? Or to find out a way to properly check and escape existing markup calls in the render method?Thanks!
Comment #18
chx CreditAttribution: chx commentedComment #19
aneek CreditAttribution: aneek commentedComment #20
aneek CreditAttribution: aneek commentedAdding a new patch based on twig template. Instead of directly using "inline_template" twig services are used directly. Question comes,
renderInline()
directly? REF: TwigEnvironment::renderInlineReviews please :-)
Thanks!
Comment #21
aneek CreditAttribution: aneek commentedComment #23
aneek CreditAttribution: aneek commentedAny new implementation logic or new idea on the existing patch? I am sure I am missing something.
Help!!! :-)
Comment #24
joelpittet@aneek sorry for the slow replies. The idea is to be as simple as possible, if the intent is to write markup using #markup, then there would be no need to use prefix/suffix.
#type => inline_template and #markup in my mind are synonymous with one another. Though I'm not sure everybody agrees and ideally we would get rid of #markup is the goal but it's so heavily used that it would be a chore and like not a favourable outcome.
My idea is to try to encourage more responsible use of #markup by passing in the variables they can be escaped. And a side benefit is that it's a bit less to type because we don't have to declare the #type with #markup, which makes it a bit simpler than inline_template.
I don't know maybe it's a crazy idea.
Re: #20 I'd prefer to call it directly but it bypasses render cache and alter hooks that way. So that is unlikely a thing we will be doing in core.
#markup is secure as we make it. We could still concatenate strings with variables into it... which could make it insecure. Or we can escape them before hand, or let twig do that for us. I'm pulling for the latter but it may not be the best idea and there may be better ideas out there...
The reason it's not the best is in my mind because we have to enforce a convention and we will likely as developers take the shortest route to solve a problem. Though because that approach looks shorter than #type inline_template, it may be easier to adopt.
Does that help clarify my proposal?
Comment #25
aneek CreditAttribution: aneek commented@joelpittet, thanks for the clarifications. Patch #20 has some issues as you mentioned about bypassing the alters and preprocesses. Also it may pose some security violations (Not sure). However, there is surely other efficient features available.
Though, I'd love to discuss this with you in IRC #drupal-contribute.
Thanks!
Comment #26
xjmComment #27
aneek CreditAttribution: aneek commentedHad a discussion with @joelpittet. Both of us agree on removing the usage of inline_template if all of it's usability can be ported in #markup and this can serve the purpose of inline_template usages.
As example,
can be the same as inline_template usages. But
#markup
has to be properly escaped. This issue may be the right place to do this one.But first, need to find and fix issues with Renderer.php and any reasons that patch #20 has failed. Any guidance in this will be really helpful.
Love to hear more reviews and suggestions about this from other contributes.
Thanks!
Comment #30
dawehnerOne point of making inline_template not as simple as possible was that we the plan was to encourage people to use proper templates,
though I think this point is less important than the issues caused by autoescaping.
Comment #32
aneek CreditAttribution: aneek commentedGetting a bit confused here. Every time the patch is re-queued and shows new types of test fails. :-/
Any ideas?
Comment #33
cpj CreditAttribution: cpj commented@aneek, Re: #27 I agree, if #markup can do what inline_template was doing, there seems to be a good case for removing the usage of inline_template. But I also agree with @dawehner that proper templating should be encouraged where ever possible.
Comment #34
aneek CreditAttribution: aneek commented@cpj, good thought. But I doubt that "inline_template" would be removed. Had a discussion with @alexpott and seems the chances are low on removing it. But the main target should be to make this patch up and running. Do you have any suggestions on this?
Thanks!
Comment #36
star-szrTagging for https://drupaltwig.org.
Comment #39
Fabianx CreditAttribution: Fabianx commentedI believe this issue has gotten off track a lot from the original intent.
I am sorry for blowing the party ( :-D ), but I would like to try - if I may - to state again our original intent of making #markup safe.
What we want to avoid is someone to write:
where $entity->title ==
<script>alert('xss');</script>
.The situation when this issue was written was different as there was no real alternative to #markup.
But we now have #inline_template, we have templates, we have render trees, etc.
And we also solved it for #prefix and #suffix.
I think at this point all we want to do is to apply the following patch:
And thats it.
If you use it right, your #markup is already secure, hence this is fast.
If you do not use it right, then your markup goes via the XSS filter and then is secure.
I would do this explicitly and not combine with #description, because #markup needs to be a scalar. Everything else is a bug. Fortunately isset() is a no-op in terms of performance.
--
Does anyone have any objections to this really minimal impact solution to fix a (needs triage) critical?
Comment #40
Fabianx CreditAttribution: Fabianx commentedHere is a patch, lets see if it works.
Comment #42
iMiksuThis probably needs tests, I'll try to write some.
Comment #43
iMiksuThe patch in #40 seem to cause failures, but I've attached here now only with tests (which fails until proposed solution is implemented) and then with the #40 implementation.
I put status to "Needs review" for triggering testbot, the implementation still needs work I guess (as it doesn't pass the tests).
Comment #45
iMiksuApparently after retesting #40 implementation, it look like it does not cause failures anymore. Therefore status "Needs review" is valid!EDIT: Will see now with the implementation :)
Comment #47
aneek CreditAttribution: aneek commentedPatch #11 had implementation of
SafeMarkup::checkAdminXSS
but there were many failures. I think dueto adminxss elements are getting stripped off.Comment #48
kim.pepperComment #49
aneek CreditAttribution: aneek commented@Fabianx, my first implementation was to introduce
SafeMarkup::checkAdminXss()
in the doRender() method but along with huge lists of failure I figured out that some tags in#markup
are getting replaced bySafeMarkup::checkAdminXss()
.If you see the patch #11 and the comment #15 you will see
Drupal\system\Tests\Common\RenderTest->testDrupalRenderChildElementRenderCachePlaceholder()
gives false due to some tags are stripped out of#markup
.Such as
Value '<pre></pre> ' is identical to value '<pre><bar>elk</bar></pre> '.
.Since we have inline template so I've used the inline template renderer service in #20 patch.
I hope there will be a much better way of doing this. What are your thoughts on these?
Comment #50
Fabianx CreditAttribution: Fabianx commentedThere is 3 cases:
- a) Invalid markup, e.g.
'<bar>'
, if you need that you need to ensure your string is safe. If tests use that it should be replaced with something else.- b) Valid #markup that needs to use
'<script></script>'
, again in this code you need to mark it safe and be done.- c) Valid #markup that was safe, but was concat-ed in an unsafe way, hence leading to things being stripped.
Given those cases I don't think the approach to XSS-escape #markup is wrong or too much, because it will be rare to put things that are not XSS safe in #markup directly.
I think this is the main culprit, just putting #markup through renderInlineTemplate won't work as that would mean the template is secure, which it is not in this case.
I think b) should be rare in core, but a) or c) likely are problems of the test failures.
So task list would be:
- Fix test failures purely based on forbidden tags, e.g
'<bar>'
- Fix test failures that accept
'<script>'
, not sure we have that. (Edit: we have)- Fix concat-problems, where #markup is concat'ed.
The following command can be used to find concats:
Note: Not every #markup needs to be replaced with #inline_template, it is IMHO okay to still support the #prefix, #markup, #suffix version for some cases or even using String::format for simple concats.
Comment #51
BerdirSee #2446995: Block content titles are not escaped on new block form (Port SA-CONTRIB-2013-082) for an example of a security issue that this would help prevent.
Comment #52
mikey_p CreditAttribution: mikey_p commentedRe-roll for the changes to render tests.
Comment #53
effulgentsia CreditAttribution: effulgentsia commentedComment #54
mikey_p CreditAttribution: mikey_p commentedThis is still really broken in lots of places in core.
I also noticed that this broke some of the rendering tests, which seems to be related the fact that it's stripping out placeholders for the #post_render_cache.
There's also some failures in Drupal\Tests\Core\Render\RendererBubblingTest::testBubblingWithPrerender that I wasn't able to identify.
Comment #56
Fabianx CreditAttribution: Fabianx commented#54: Genius, #post_render_cache explains a lot of the many test failures ...
/me singing 'We need a placeholder API, a placeholder API, etc.'
For now I think drupal_render_cache_generate_placeholder might need a SafeMarkup::set added and that might fix a lot :).
Comment #57
mikey_p CreditAttribution: mikey_p commentedIt looks like we might be able to get this adding 'drupal-render-cache-placeholder' to the allowed list of tags, and reworking Xss::attributes() to skip that callback attribute...but then I don't know if we're getting into dangerous territory of remote code execution via render callbacks?
Comment #58
Fabianx CreditAttribution: Fabianx commentedLets see!
Comment #59
CB#58 looks good to me. Waiting on testbot still though.
Comment #61
Fabianx CreditAttribution: Fabianx commentedContextual Fixed now ...
Comment #63
mikey_p CreditAttribution: mikey_p commentedUpdated to fix failures in render tests with fake tags.
Comment #64
Fabianx CreditAttribution: Fabianx commentedThe failure in Drupal\filter\Tests\FilterAPITest shows something interesting.
Our whole text filter chain is not distinguishing between secure and insecure in and outputs - even though it could.
Obviously its possible to do:
to fix the #post_render_cache case, but this would probably break when applying another filter afterwards and this should probably be secure in the first place ...
I am assigning shortly to effulgentsia to get his thoughts / feedback on that (Thanks for all the work on String::format and l()!).
That probably warrants its own sub-issue ...
Comment #65
mikey_p CreditAttribution: mikey_p commentedIs there anything that prevents a cache placeholder from being embedded in a larger string of text within #markup? I suppose we could say that's not best practices, but it seems like our current API would totally support that, and this patch would break that.
Comment #67
mikey_p CreditAttribution: mikey_p commentedAlso I doubt that most of these failures are related to post_render_cache tokens since that's only used in a few places in core.
Comment #68
Wim LeersThis actually reminds me of #1333730: [Meta] PHP DOM (libxml2) misinterprets HTML5 — PHP being unable to parse HTML5 correctly, and #952964: String stripped from title and alt attribute if contains a colon/#2105841: Xss filter() mangles image captions and title/alt/data attributes/#2321061: Xss::split() fails on custom HTML elements with colons in the name — basically, our XSS filtering stripping things that it should not strip. Quite possibly the problem being seen here is a duplicate of that.
Comment #69
effulgentsia CreditAttribution: effulgentsia commentedThere's no corresponding
use
statement, so this is broken. Removing it from this patch, though maybe we should bring it back with a use statement?Maybe I'm missing something, but I think simply running the filters is enough to consider the output safe (if not, then that's a bug with the filters, not something for us to babysit), so here's a way to do that and fix that FilterApiTest.
Comment #71
Fabianx CreditAttribution: Fabianx commentedThanks effulgentsia!
That makes sense.
Comment #72
killua99 CreditAttribution: killua99 commentedI'm reviewing the patch. I'll bring some updates in 10 or 20 (if drush or something pass the test)
Comment #73
mikey_p CreditAttribution: mikey_p commentedSo I took a look at this today, and the very first example I ran into was Drupal\action\Tests\BulkFormTest, which tests views bulk action forms which use placeholders to place the checkboxes. The placeholder replacement works fine, but in views_pre_render_views_form_views_form() the substituted output is placed into $element['output']['#markup'], which of course strips input and label tags when rendered. Going back to the comments in #50, I'm not sure what the right approach would be here. Obviously I could just do SafeMarkup::set() on the output, but I have a few concerns about this:
1. I suspect this patterns of dumping large rendered strings into #markup is probably somewhat common
2. We can try to be careful to only use SafeMarkup::set() when we know something is safe, but it's really hard to tell what is in the rendered element. Marking such large chunks of HTML as safe could make it easier for a contrib module to modify something that gets marked as safe, when it hasn't been sanitized.
3. I have no idea if this is true, or I'm just talking out of my ass, but it seems like marking increasingly large chunks of HTML as safe could affect the performance of SafeMarkup, or it's memory usage.
Comment #74
mikey_p CreditAttribution: mikey_p commentedTo expand on #2, we're assuming that everything coming out of drupal_render is safe, and I don't know if that is a safe assumption or not. I guess overall, we should probably try to stick to our general principle of not trying to overly babysit contrib. At least with this new API, we can search of usages of SafeMarkup::set() and check to see if output is truly safe or not.
I suppose if we ensure that only output directly from drupal_render is passed to SafeMarkup::set, we should be reasonably safe after this issue is finished.
Comment #75
mikey_p CreditAttribution: mikey_p commentedFix for views_pre_render_views_form_views_form().
Comment #76
effulgentsia CreditAttribution: effulgentsia at Acquia commentedWhat views_pre_render_views_form_views_form() does seems very similar to what String::format() does, except the tokenized string isn't a literal, and the token names don't all start with the handy @, !, % prefixes. What about adding a new method to the String class, like
replace()
, and have it treat each substitution token as the equivalent of a pass-through (but without requiring the ! prefix)? And then having String::replace() only mark the output as safe if 1) the tokenized string is already marked safe and 2) every substitution parameter is already marked safe?Comment #78
mikey_p CreditAttribution: mikey_p commentedThat sounds pretty similar to the SafeMarkup::concat stuff that was confusing earlier, but I think this is better, for the following reasons:
1) It wouldn't actually mark anything as safe, unless all of it's components were already safe
2) It'd be setting a good example and better API for contrib developers (given the security nature of this task, is pretty important).
Comment #79
mikey_p CreditAttribution: mikey_p commentedComment #81
aneek CreditAttribution: aneek commented@mikey_p,
Applied patch #79 and had a look at the fails (Drupal\dblog\Tests\DbLogTest).
In
/core/modules/dblog/src/Tests/DbLogTest.php
@ line # 321 we can see that the test suite is checking for database logger event. The event message is generated by,code and it should generate a proper HTML Markup. Instead it's generating,
This is double escaped. The next code block that used
html_entity_decode
can't properly decode it. So,fails.
As per my thoughts adding
SafeMarkup::set()
can solve this issue. But I don't think addingSafeMarkup::set()
in those areas (many cases) where double escaping is happening is worth.Comment #82
mikey_p CreditAttribution: mikey_p commentedLooked into some more of these and a bunch are due to the fact that this broke the BatchAPI somehow. I'm pretty baffled as to what's going on, when I check the verbose messages, I'm seeing that the batch error is a 406 Not Acceptable. I'll have to look into this some more later, but thought I'd throw this up in case someone elese wants to look into the failures.
The worst part is that unlike the other errors, I can't recreate this one with manual testing. It appears to be some artifact of the way simpletest is executing the batch.
Comment #83
Wim Leersnull
is listed as one of the acceptable MIME types in that screenshot. That's highly suspicious.Comment #84
mikey_p CreditAttribution: mikey_p commentedThat took me a lot longer to figure out that I would like to admit. The whole 406 thing was a bit of a red herring, due to JS triggering on the verbose debug pages. This should fix a number of the config import tests, batch test, node access rebuild test, etc.
Comment #86
aneek CreditAttribution: aneek commented@mikey_p, great find. But still there are 2 issues.
These might need some attention. I hope contextual link was solved in patch #61? But removed later on somehow.
Comment #87
Fabianx CreditAttribution: Fabianx commentedYes, lets bring #61 back with a use statement :).
Comment #88
Fabianx CreditAttribution: Fabianx commentedHere is an analysis of all SafeMarkup::set().
TL;DR, only Filter and String class should need to use SafeMarkup::set(), the rest more high-level API functions.
This is fine for ::set
This can use String::format with ! as $markup was already marked safe.
This can use String::format with % for callback and token, should be safe to check_plain() it being base64 and a valid PHP function.
This means the text was filtered for a known text format.
It is safe by design in that case.
',
If we want to keep this can use String::format with % for the context['bar'], but this is fine, too.
Comment #89
lauriiiDouble escaping of characters at Drupal\rest\Tests\Views\StyleSerializerTest still needs to be figured out
Comment #90
aneek CreditAttribution: aneek commentedPosting a patch addressing some of the findings by @Fabianx.
Btw, @Fabianx why use
%
placeholder forString::format()
? This will runString::placeholder
method to add extra<em class="placeholder">
. So I've used@
and the string in it will be check plain'd.Haven't changed the
<bar></bar>
tags. Lets see how it runs through and then in the next patch we can address those.FYI, double escaping is not solved yet. Checked locally. Have to debug
Drupal\rest\Tests\Views\StyleSerializerTest->testFieldapiField()
throughly.Comment #91
aneek CreditAttribution: aneek commented@lauriii, my friend, sorry I posted the same patch (only different in some placeholders). Can you please make your patches for review too?
Comment #92
lauriiiI think it got messed up somehow because of double patches. Probably should reupload?
Comment #95
aneek CreditAttribution: aneek commented@Fabianx,
HtmlTagTest
failed as we implementedString::format()
with!
placeholder. MaybeSafeMarkup::isSafe
is not returning the string as safe so this is happening. Trying with inline_template instead of String::format. Let's see :-)Comment #96
aneek CreditAttribution: aneek commentedComment #98
Fabianx CreditAttribution: Fabianx commentedHm, I don't get it, why !markup is deemed not safe. Will do a local test later today.
Comment #101
joelpittet@Fabianx I believe that is due to a change that says you need to wrap the markup as SafeMarkup before passing it through to format.
#2399037: String::format() marks a resulting string as safe even when passed an unsafe passthrough argument
I've been saying it's not the right way to go but seems not many agree.
Comment #102
dimaro CreditAttribution: dimaro at La Drupalera by Emergya commentedThe latest patch does not apply, I have the hope that this reroll work.
Comment #104
dimaro CreditAttribution: dimaro at La Drupalera by Emergya commentedTrying to reroll again #95 moving the functions testReplaces and providerReplace from StringTest to SafeMarkupTest and function replace from String to SafeMarkup.
Comment #106
dimaro CreditAttribution: dimaro at La Drupalera by Emergya commentedI've replaced all String calls to SafeMarkup for avoiding fatal errors. Trying again!
Comment #108
Anonymous (not verified) CreditAttribution: Anonymous at Druid commentedthis causes a hefty amount of exceptions as the #markup is expected to be a string
could probably be
Comment #109
dimaro CreditAttribution: dimaro at La Drupalera by Emergya commentedSend new patch to try to correct the multiple exceptions, but any ideas for errors?
Comment #110
dimaro CreditAttribution: dimaro at La Drupalera by Emergya commentedSend new patch to try to correct the multiple exceptions, but any ideas for errors?
Comment #112
joelpittetThere are probably many other things but this is just a quick pass on that last change. A style note, if you are changing an hunk that has an array in it, change the array style to the short syntax(but don't change arrays outside the hunk or it will introduce unnessasary noise and conflicts).
Before there was no space around the markup variable. And an extra space showed up after the = sign.
Use array short syntax here.
['markup' => $markup]
please.Periods at the ends of sentences.
Use double quotes on the inside and avoid the unnecessary escaping.
Comment #115
dawehnerStuff like that seriously should have some kind of documentation.
Also for stuff like that documentation would be great, especially in the context of #2280965: [meta] Remove every SafeMarkup::set() call
Comment #116
larowlanFixes 115 and 112
Comment #117
larowlanComment #121
effulgentsia CreditAttribution: effulgentsia at Acquia commentedI discussed this with @alexpott, @xjm, and @webchick, and we agreed to keep this prioritized as Critical, because it is so common for modules to set #markup to a variable that might or might not contain user input that failing to fix this will likely result in many XSS vulnerabilities in contrib (and quite possibly, some in core). If we get close to an RC without this being close to fixed, we could instead document somewhere very prominently that it is the caller's responsibility to only set #markup to trusted or sanitized strings (as it is in 7.x), but it would be so much nicer to close off this vector instead.
Comment #122
larowlankicking tyres
Comment #123
larowlanReverts 95, more tomorrow
Comment #124
larowlanperchance kind bot
Comment #126
lauriiiwoot! Only 9 fails left :)
Comment #127
aneek CreditAttribution: aneek commentedTried to reduce the remaining issues and found the following.
The issue:
Drupal\rest\Tests\Views\StyleSerializerTest
Value '<p>ujTalrVZ2au3FQyRgGIiuXnxVK5H9Qz9</p> ' is equal to value '<p>ujTalrVZ2au3FQyRgGIiuXnxVK5H9Qz9</p> '.
While debugging I found
RestExport::execute()
in/core/modules/rest/src/Plugin/views/display/RestExport.php
returns an JSON data.This calls
doRender()
method with$element['#markup']
set as[{"nid":"1","body":"<p>haQmpp1G8d6zOAAmbemKiFwYiDokerkp<\\/p>\\n"}]
which is a JSON. So inSafeMarkup::checkAdminXss()
this returns as FALSE bySafeMarkup::isSafe()
.This is why the JSON string is double escaped and test is failing in
StyleSerializerTest::testFieldapiField()
@ line 317.I hope this can be useful. I will try to debug/fix this or more later this week.
Love to hear about it's solution and thoughts from others :-)
Comment #128
star-szrMay not be relevant but thought it worth mentioning: Twig does have different escaping "strategies", including one for JS.
http://twig.sensiolabs.org/doc/filters/escape.html
Comment #129
aneek CreditAttribution: aneek commented@Cottser thanks for this link. This is also mentioned in the SafeMarkup.php file. This may come handy if we have to remove
SafeMarkup::set()
in this issue.Consider the following case, (Same as the test case
testFieldapiField()
)The data is a JSON format and sent via the REST and it will be double escaped. So there can be two things as per my understanding.
$strategy = 'json'
inSafeMarkup.php
RestExport::render()
should consider escaping JSON value while returning it to Drupal render.Please suggest if these seems good or can have a better approach.
Thanks!
Comment #130
larowlanIndividual fields in the DataFieldRow plugin are sanitized in
\Drupal\views\Plugin\views\field\FieldPluginBase::advancedRender()
and we can safely assume that the Serializer does not introduce XSS when transforming the array into the particular format, hence we can safely mark the whole serialized string as safe.Added a test to verify that.
Also fixed AreaTest
More tomorrow
Comment #131
larowlanComment #133
aneek CreditAttribution: aneek commented@larowlan, fixed the "StyleSerializerTest" error. Uploading a new one.
Comment #134
aneek CreditAttribution: aneek commentedComment #136
aneek CreditAttribution: aneek commentedBased on #2435493: Change String::format()'s '@' and '%' placeholders to be auto-escaped rather than always-escaped, '@' will not double escape an already escaped/marked safe string. Uploading a new patch to fix HtmlTagTest.php error.
Comment #137
aneek CreditAttribution: aneek commentedComment #139
aneek CreditAttribution: aneek commentedAhh, code had a bug. Uploading again.
Comment #140
aneek CreditAttribution: aneek commentedComment #142
aneek CreditAttribution: aneek commentedAhh, crap. Not fixed. I will debug more. In the mean time please use patch #133 for now.
Comment #143
larowlanWhy a separate method, this means a whole new Drupal install just for this tiny test.
Comment #144
larowlanWe should do this for the title as well, body is a formatted text field, so would already have check_markup equivalent called on it. Using title as well would add extra safeguard
Comment #145
aneek CreditAttribution: aneek commented@larowlan,
For #143 - It seemed that using
$result = $this->drupalGetJSON('test/serialize/node-field');
twice was returning the previous node's results. So I made the tests different. Well you are right with burdening the system, I will try to merge these 2 functions together and check.For #144 - Yes we could add an XSS check for the title as well.
Comment #146
chx CreditAttribution: chx commentedI am slowly getting back to this issue, the two tests in #133 , one is trivial to fix, we discussed with aneek (who is very awesome for doing this finally!). As for other, the Views fail in #133 is due to the broken page attached. Now, in views_ui.theme.inc we see these:
The test is missing
It stands to reason the drupal_renders either come back empty handed or they just get filtered out wholesale. Haven't really dug what happens but I feel someone who is more into this can easily carry the issue home from here -- if not I will take a deeper look tomorrow.
Comment #147
aneek CreditAttribution: aneek commentedUploading a new patch. Considering,
StyleSerializerTest
but this gave birth to a new major issue.#2477157: rest_export Views display plugin does not set necessary cache metadata We should follow-up on this. @todo comment is also added in this patch.
Thanks @Berdir & @Fabianx.
Comment #148
aneek CreditAttribution: aneek commentedComment #150
aneek CreditAttribution: aneek commentedOnly
FilterBooleanWebTest
issue is left. If someone with knowledge in views can address this will be great as @chx said.Comment #151
chx CreditAttribution: chx commentedComment #152
joelpittet@chx nice, that is interesting, I did a very similar shuffle of that same remove link over here at one point in time: https://www.drupal.org/node/1963978#comment-9200453
And this related patch may conflict as it's fixing that JS to actually work for this VDC modal:
https://www.drupal.org/node/2168893#comment-9856001
Can we avoid the explicit
drupal_render()
and put it as'default' => ['data' => $default],
and'remove' => ['data' => $remove],
Comment #153
lauriii@joelpittet it should be working. Should we also remove all 20 other drupal_render calls from there or should we just open a follow-up for them?
Comment #154
joelpittetFollow up for sure:) I treat those changes the same as short array syntax to avoid needless rerolls of other patches on unrelated changes. Changing them only in the changed hunks.
Comment #155
lauriiiI think we have tests from #43
Comment #156
star-szrIt's green! Thank you everyone. This is looking good to me overall, some minor points below.
Minor: missing trailing comma per https://www.drupal.org/coding-standards#array.
Minor: Mixing short and long array syntax, why not just go short?
Huh? I think this comment can be improved :)
Comment #157
larowlanfixes #156
Comment #158
joelpittetOk going through this with a bit of a fineish tooth comb for what I can catch/grasp:
Need to document this one still.
Are these unset()'s needed?
Nit: No need for colon and yes to capital 'Find'.
This change needed?
Comment #159
lauriiiComment #160
lauriiiChange 4. is needed to make tests pass. randomString can contain characters that are being autoescaped which makes tests fail
Comment #161
dawehnerSome questions
Mh, that feels kinda odd honestly, the serializer style plugin should not care about its individual row plugins. Why are we not able to mark both the DataFieldRow and DataEntityRow as safe? ... The problem is that contrib has no way to fix the problem, in case they have the same.
Should we not rather escape the expected output to avoid the random test failures? By doing that, we would ensure that the escaping actually happens. By using
randomMachineName()
this information is not given.Comment #162
larowlanpoking
Comment #163
larowlanwow styleserializertest is so slow to run locally
Comment #164
larowlan#163 fixes #161
Comment #165
dawehner@larowlan It would have been nice if you would have explained, why you made some of the changes :)
What is the movitation to remove the drupal_render_root() calll, ... Is there really a technical need? Note: We might want to bubble up some metadata at some point, and then not calling could be a problem. I mean do we need to change RestExport at all here, given that it is just one unnecessary conflict with 2477157
So we don't actually need that?
For testing purposes it feels better to rely on something which doens't change state internally but I guess there is no way to do that :(
Comment #166
larowlanSorry, should have explained.
Yeah we don't need the Drupal render call, it's already a string (XML/json) that's been rendered once already, but then shoved into #markup which gets rendered again. Agree on the metadata so added the to-do.
Comment #167
Wim LeersWhy not just return a
CacheableResponse
instead? Why defer the cacheability metadata to a todo?Comment #168
dawehnerBecause, its entirely out of scope of this issue! Even more, we have a dedicated issue to just do that: #2477157: rest_export Views display plugin does not set necessary cache metadata, which is RTBC
Comment #169
Wim LeersLOL, right, that's why I was confused then :P I was reviewing both issues after each other, and got confused about that :D I even thought that the @todo linked to this issue.
/me goes to get lunch, need more sugar for thinking :P
Then I think this patch is close to ready?
Comment #170
dawehnerWell, certainly the issue summary seems entirely out of date.
Comment #171
Wim LeersYes, indeed. Just referring to the patch itself now.
Can somebody please update the IS?
Comment #172
larowlanUpdated issue summary
Comment #173
aneek CreditAttribution: aneek commentedComment #174
aneek CreditAttribution: aneek commentedMy previous comment deleted somehow.Anyway,
#2477157: rest_export Views display plugin does not set necessary cache metadata is now fixed so we have to remove/refactor the following,
And
Thanks!
Comment #175
larowlanComment #176
larowlanNot sure if this will fail until such time as #2352009: [pp-3] Bubbling of elements' max-age to the page's headers and the page cache is in.
Comment #177
larowlanNot sure how the credit thing works. How can you tell it the first 5 patches were done on my own time and the last one was done on on company time? Or is not that sophisticated?
Comment #179
larowlanComment #182
dawehnerWell, I would think given that the attribute thing is a checkbox now, you can simply tick both.
Can't re revert all this now? I don't see an advantage ... well $output is a render array and so to convert it to a string you call a render function ...
Comment #183
larowlanWe can revert this
But I don't think we can revert this
I think reverting the second hunk will result in the stuff that's in $output['#markup'] being escaped twice.
Comment #184
dawehnerWait, so running the serializer also escapes certain entries? In case it does, I would expect that the same string is safe already, so double escaping should not happen. It seems to be at least worth documenting what is happening.
The reason why I think we should keep the call is that we want to ensure that potential processing done by the renderer, which might be important in the future, is executed.
Sorry for maybe being annoying here.
Comment #185
larowlanLet's see
The issue is, the components inside the serialized string are escaped, but not the whole string.
So the whole string isn't marked as safe (the xml/json meta bits aren't marked safe) but the field values (that went via rendering) are - so the aggregate string isn't previously marked safe.
Comment #187
larowlanFailed because of double escaping
Comment #188
larowlanComment #189
larowlanforgot to merge
Comment #190
larowlanwrong file again
Comment #191
Fabianx CreditAttribution: Fabianx as a volunteer commentedSo far we have treated the concat of safe strings as safe as well.
Maybe we need to do so in the renderer as well?
Comment #192
MiroslavBanov CreditAttribution: MiroslavBanov commentedI saw it mentioned in #112 that short array should be used on changed lines but I see it in 6 places on last patch:
+ $element['output'] = array('#markup'
+ $default = array(
+ $element['#markup'] = SafeMarkup::format('<noscript>!markup</noscript>', array('!markup'
+ return SafeMarkup::format('<drupal-render-cache-placeholder callback="@callback" token="@token"></drupal-render-cache-placeholder>', array(
+ $element['#markup'] = SafeMarkup::set('<div' . new Attribute(array('data-contextual-id'
+ $node->body = array(
Not a big deal for me but I am curious about the overall policy on this.
Comment #193
lauriii@MiroslavBanov there is no solid decision about that but I've seen most of people using the new array syntax on the changed lines so in some point we would have every piece of code using the new array syntax at some point.
Comment #194
dawehner#2135291: [Policy, no patch] PHP 5.4 short array syntax coding standards for Drupal 8, short story: Cool, if you do so, otherwise nevermind.
Comment #195
MiroslavBanov CreditAttribution: MiroslavBanov commentedPretty good summary, thanks :).
Comment #196
googletorp CreditAttribution: googletorp at Reveal IT commentedMissing php doc for params and return.
Comment #197
larowlanFixes 196
Comment #200
lauriiiI think change record would be quite useful for this
Comment #201
dawehnerAdded a quick one for
SafeMarkup::replace()
Comment #202
lauriii@dawehner++
We could still add another for the fact that #markup is being escaped automaticly. That might be very confusing for people.
Comment #203
dawehnerTwo change recoreds got ridden: https://www.drupal.org/node/2489968 https://www.drupal.org/node/2489900
Comment #204
lauriiiIssue summary looks good to me so no need to update that. We have also the change records here so this is RTBC to me.
Comment #205
xjmNice work on this. Overall I think this is pretty much ready. I like the DX of
SafeMarkup::replace()
and I think this is a good security improvement as well. It's also great to see the detailed comments justifying each use ofSafeMarkup::set()
as this will ensure best practices are followed and that we aren't unintentionally leaving unsafe things unescaped.The change records look great, but can we move https://www.drupal.org/node/2489968 into the main SafeMarkup CR?
Also, it would be good to see some before-and-after testing of the Views UI stuff to ensure we're not (e.g.) introducing double-escaping or other formatting errors. (40% of the time I spent reviewing this patch went into parsing that diff with the array reformatting. It's definitely more readable now but it was hard to parse out the actual changes.) Thanks!
Finally, some extremely minor nitpicks that are totally not worth blocking this on, but that @jsmith said he could fix for us here at the LA extended sprints. :)
Nit:
str_replace()
should have parens so it gets linked automatically to php.net (at least this used to work).Nit: Comma splice.
Had to read this twice. If any one of the replacements is unsafe, then we don't mark anything new as safe. I'd add a comment before the
if
something like "If any replacement is unsafe, then the output is also unsafe, so just return the output."Nit: This is minorly backwards from the rest of the method and it might be more intuitive if it were the other way around, so that doing the
SafeMarkup::set()
is the last thing.Nit: instances. I also got a bit confused reading this. I'd reword this as: "This markup is safe because the only arguments will be instances of \..., which is passed through \...checkPlain().
Nit: filtering and sanitizing have been done. Also what does "its" refer to?
Note to me: look up the intent of the use of
<bar>
in these tests.Comment #206
jaredsmith CreditAttribution: jaredsmith at Bluehost commentedUpdating patch to address @xjm's comments in comment 205. Note that I didn't make any changes for part 7 of the code review, as it was a note for @xjm to check before committing.
Will work on manual testing and change records next.
Comment #207
YesCT CreditAttribution: YesCT commented@jsmith thank you
since we are nitting and fixing immediately... the added () push this (further) over 80 chars.
bit awkward wording.
How about "is safe because the arguments will only be instances of"
Missing a period at the end of the sentence.
rest of the changes look great.
Comment #208
jaredsmith CreditAttribution: jaredsmith at Bluehost commentedUpdating the patch based on the comments on 207. Also re-doing the interdiff (from 197 to this patch) properly to make it easier to follow. I'll plead ignorance that my interdiffs weren't being done exactly correctly, and repent of my sins :-)
Comment #209
jaredsmith CreditAttribution: jaredsmith at Bluehost commentedAdding an interdiff between 206 and 208 for @YesCT.
Comment #210
YesCT CreditAttribution: YesCT commentedI read the interdiff. Those changes look great.
We are now going to work on the change record.
Comment #212
YesCT CreditAttribution: YesCT commentedWe found the change record @xjm asked "#markup is now automatically escaped" https://www.drupal.org/node/2489968 to be merged into: "Twig autoescape enabled. New SafeMarkup class added." https://www.drupal.org/node/2296163
@jaredsmith is doing that now.
I'm looking into the fails.
Comment #213
star-szrFantastic team work everyone! Just want to note that @larowlan's patch in #197 had those same fails until I hit retest. Potentially a random factor somewhere worth testing.
Comment #214
YesCT CreditAttribution: YesCT commentedthe interdiff since 206 should not have caused something different in the test results....
fails in #208 are
FAILED: [[SimpleTest]]: [PHP 5.4 MySQL] 94,399 pass(es), 2 fail(s), and 0 exception(s).
#206 was PASSED: [[SimpleTest]]: [PHP 5.4 MySQL] 94,377 pass(es).
"false is TRUE"
these could be clearer as to what they are trying to assert, and have a better assert message. "string from the header does not appear" ... ??
--
taking a break. anyone else is welcome to look into this.
Comment #215
larowlanSee #161.2 and #163 - personally I prefer just making it ->randomMachineName() to remove chance of randoms
Comment #216
larowlan/me takes baton
Comment #217
larowlanThis gives us the piece of mind that escaping happens, without relying on the randomness of randomString() - which could throw out markup on some instances (eg
&, < or >
).Passes locally.
Comment #219
larowlanshould have checked the diff first, was too aggressive with cleanup - this is what I meant to submit
Comment #220
YesCT CreditAttribution: YesCT commentedThanks!
1.
We should still fix the confusing "false is TRUE" assert messages and make them mean something.
do these need to be unique? boo1, boo2, boo3?
Comment #221
YesCT CreditAttribution: YesCT commentedMerged the change record and deleted the unused one.
Added this issue to the related issues for that main change record.
Please,
check the merge at https://www.drupal.org/node/2296163/revisions/view/8472544/8473166
(Since we are changing and existing change record, to get it tweeted out, we can toggle "published" to trigger the actions to announce this when it goes in.)
Comment #222
larowlanFixes #220.1
220.2 - nope, don't need to be unique as $this->randomMachineName() ensures that.
Comment #223
larowlanMade more changes to CR - https://www.drupal.org/node/2296163/revisions/view/8473166/8473230 looks good
Comment #224
YesCT CreditAttribution: YesCT commentedok. I looked and everything @xjm asked for in #205 is done. so re-rtbc'ing
Comment #225
catchHave given this a once over and only found one issue which is fixable on commit. Bit too early in the morning for me to want to commit this without another review, so I'll try to look again later if no-one's beaten me to it. Also think the DX of SafeMarkup::replace() is decent.
Nit: should be testReplace()
Comment #226
jcnventura CreditAttribution: jcnventura commentedWell, if you want to be pedantic, you can always convert the arrays to short syntax...
Comment #227
YesCT CreditAttribution: YesCT commentedI can go ahead and change the name now.
Comment #228
YesCT CreditAttribution: YesCT commentedoh... sorry. I get it. It was a test name and only there in that one place. well, anyway, here it is.
Comment #229
alexpott@YesCT's is what @catch suggested.
Comment #230
YesCT CreditAttribution: YesCT commented@jcnventura Both array syntaxes are allowed right now. Until something is decided in #2135291: [Policy, no patch] PHP 5.4 short array syntax coding standards for Drupal 8. So no need to change that here.
Comment #231
Fabianx CreditAttribution: Fabianx as a volunteer and at Tag1 Consulting commentedAfter looking it over again: RTBC + 1, great work everyone!
Comment #232
xjmReviewing (again) on an airplane. Whee!
Comment #233
xjmSorry... I found more things. A lot of this is very much followup material, but some of it is docs gate, and some of it is about the robustness of the test coverage for Views specifically.
Well done on tracking down the random fail!
I had a good solid think about whether it was correct to use
randomMachineName()
instead here, and I think it is. When it fails randomly, Views is (correctly) converting e.g.<
to<
. (Related issue: #2487498: Make randomString always return a > to avoid random test fails.)The assertion messages are a definite improvement, as it also makes it explicit what the intent of the check is. That said, I think either they're not quite complete, or the inline comments around them in this test need to be updated.
The original intent of the test was just to verify that each area got rendered properly. Now, we're implicitly adding that it should also check that it protects against XSS. Wouldn't it be better to test for the expected escaped string, rather than checking the results of
Xss::filterAdmin()
? And maybe expand the assertion messages (and the inline comment above them) to indicate we're testing both that it's sanitized to remove the script tag, and that it's rendered.Nit: Replaces.
Minor: Move me up above
if (!is_array($replace)) {
since it helps explain both theif
and theelse
.This should go down inside the final
else
now.Needs followup: the
#noscript
element is not documented at alllll.So, here's a followup I'd like to see:
It would seem sane to add documentation that
#markup
goes throughcheckAdminXss()
, wherever we document#markup
. Which appears to be intheme.api.php
in the theme and render topic group, and took me ages to find. (I looked in ElementInfoManagerInterface::getInfo() andElementInterface
andRenderer
andRendererInterface
and several other places...) It says:Is that... true and complete? Still even actually best practice? Should it mention, you know, templates and inline templates? Can we consolidate links to all the things and add references from the places I originally looked back to this topic? And shouldn't
#markup
be documented/mentioned somehow wherever the magic happens, rather than only in the topic documentation?In any case, followup aside, I think it's a docs gate thing to document that the
#markup
now goes throughcheckAdminXss()
. So for this issue, we can presumably just start by updating that bullet intheme.api.php
to mention this.Also maybe in scope for this issue:
HtmlTag::preRenderHtmlTag()
says:So, that callback still doesn't perform sanitization. True. But admin XSS filtering action still does happen later in the
Renderer
, I think? Can someone who really groks this confirm one way or another and suggest how the docs there should be updated?Also sorta contradicts the earlier thing about it being the caller's responsibility to sanitize. Maybe we should clarify exactly what filtering and sanitizing have been done, and what exactly is still the caller's responsibility? (Like, don't use
#markup
if you need itcheckPlain()
'ed, obviously.)Once, not too long ago, this
use
statement caused Views tests to explode with PHP crossing this with theXss
views plugin. It wasn't quite the same situation, but are we safe from that here? See #2420421: HEAD BROKEN: Fatal error: Cannot use Drupal\Component\Utility\Xss as Xss because the name is already in use in /var/lib/drupaltestbot/sites/default/files/checkout/core/modules/views/src/Plugin/views/field/Field.php on line 11. Or, this becomes moot if we remove the direct usage ofXss
(see my first comment above).I left @jaredsmith a tell about this, but did anyone ever get to that manual testing in Views UI to ensure there are no markup or escaping regressions for the remove link thingy as per my comment in #205?
Non-actionable remark: I still haven't refreshed my memory on the whole history of these nor read all of
RendererPostRenderCacheTest
for context, but I got as far as checking that these tests with the<bar>
tag came originally from #2324371: Fix common HTML escaped render #key values due to Twig autoescape and/or #2118703: Introduce #post_render_cache callback to allow for personalization without breaking the render cache (or those are related at least).SafeMarkup::format()
's vastly improved and de-fragilified (probably not a word) use of the tokens and how they behave with theSafeMarkup
list.Edits: typos and formatting.
Comment #234
xjmComment #235
jaredsmith CreditAttribution: jaredsmith at Bluehost commentedI did indeed do quite a bit of manual testing on this issue at the extented sprints at DrupalCon Los Angeles -- but it seems my comment didn't get saved.
On the advice of @joelpittet, I first looked at the Views UI without the patch installed. In particular, I looked at the People (Users) view, as it has a filtered group. I ensured that I was able to delete a filter (whether or not it was grouped), and that none of the text seemed to be double-escaped.
Next, I applied the patch (my patch from comment #208), and did the same thing -- namely, I looked at the People view UI, added a filter (grouped), removed it, and saw that I could remove other filters as well. I also looked for any text that was double-escaped, and couldn't find anything.
Comment #236
lauriiiI think I fixed points 1-4 from #233. I checked point 8 and it shouldn't be a problem because its a test and in that namespace there is no Xss class.
Comment #237
larowlanFollow-ups
So @lauriii did 1-4,8; I've done 5,6,11; @jaredsmith did 9; and 7 and 10 are non-actionable - I think we're good again here.
On #233.7 - the comment there is correct. As to is the comment on HtmlTag::preRenderHtmlTag - I've expanded the later to make it clear why 7 is correct.
Comment #238
larowlanComment #239
xjmThanks @lauriii, @larowlan, and @jaredsmith! That all looks great. Just a few comments:
Minor: Method names should have () and there is a comma at the end instead of a period.
Minor: parens after the function/method name.
Should
<script>
and<style>
have their angle brackets? (api.d.o escapes it properly; see ImageStyleInterface::buildUrl() for example.)The other was a test too, with no direct indication of any other use of any other Xss class in its namespace. :( But I'm probably being paranoid just because the bug was so weird. Core usage suggests so:
Any thoughts on:
(The answer might be "No it would not be better".) :)
FilterInterface
did? (This might be out of scope; I can't remember right now.)Comment #240
googletorp CreditAttribution: googletorp at Reveal IT commentedI addressed 1-3 from #239, not sure what to do about 4-6 if anything. Also fixed some array syntax, to use the short syntax instead of the old one.
Comment #241
xjmOh, regarding #233 point 10 and #205 point 7, what I'm asking myself there is whether the test is still providing the intended coverage when we are replacing first one and then the other arbitrary tags (
<foo>
and<bar>
) with actual tags defined in HTML5. So others can feel free to investigate and opine on that as well. :)Thanks @googletorp for the updated patch! For 4-6 I think more feedback is needed.
We really should actually avoid doing this when updating an existing array, even when we are otherwise changing the line. I often review with a word diff and converting array syntax adds irrelevant stuff to the diff. (This is also part of why in #205 I asked for manual testing of the bit in the Views UI.)
If others have used
array()
for new lines in a patch, that's okay and we don't need to change their preference with #2135291: [Policy, no patch] PHP 5.4 short array syntax coding standards for Drupal 8 still allowing both. And when we're changing lines in HEAD, we should make the minimum change possible.I do personally prefer the short syntax and the added uses here are okay I guess. It's more about scope management and reviewability in general. :)
Comment #242
googletorp CreditAttribution: googletorp at Reveal IT commented@xjm Thanks for clearing up
array()
vs[]
. I've had quite a few patches bumped from RTBC to needs work due to old array syntax, so I thought that short array syntax was required in new patches for consistency or something.Comment #243
xjmThanks @googletorp! Sorry to hear about patches being bumped back for the syntax; that's part of what I want to avoid for sure. If you see a patch set to NW for that in the future, you could suggest the reviewer talk to me, alexpott, etc. for clarification. Hopefully the policy issue link is a useful resource there (although it adds its own problems since it's still a subject of debate).
I'm updating the summary with the remaining tasks after @googletorp's updated patch.
Comment #244
xjmComment #245
xjmComment #246
xjmOh, looks like we need to rewrap the comment here since this is now over 80 characters.
Comment #247
larowlanaddressing open items
Comment #248
larowlan#246 fixed
#233.10 and #205.7 the bar isn't markup, it's a post_render_cache context. See \Drupal\Tests\Core\Render\PostRenderCache::placeholder - the intention is the context is embedded inside lt;code> tags.
#239.5 - added assert to ensure script tag was escaped
#239.6 - after reading this I'm not convinced we should be marking the result of the filter as safe - just because it has been through the OO equivalent of check_markup doesn't mean its safe - consider an input format with no html filter (ala Full HTML). So I added some extra logic there - this may break tests. But it may also allow me to tighten one of my biggest gripes with the standard install profile - the full html input format.
Comment #249
larowlanComment #251
larowlanSo this is why the test fails.
But I don't think we can say that just the text is safe because the text has been filtered via the configured filter format. What if my filter format is Full html? Does that guarantee its safe? Or do we assume the user knows what they're doing?
If so, reverse that change.
If not, we've got issues, because post render cache callbacks are stripped otherwise.
Comment #252
Wim LeersIndeed. And even the presence of a HTML-restricting filter does NOT imply the resulting HTML can be considered safe either.
Comment #253
BerdirI think that's exactly what it should mean?
After a text has been checked with a text format, it *must* be printed as-is and should be considered safe. If the text format is misconfigured, so be it, there's no way for us to know if that's the case or not. Even a very simple text format might have some sort of placeholders/token that will result in all kind of HTML, javascript and so on, and that needs to be considered safe?
If a user has access to the full html text format and decides to use it, then everything is allowed. Whether that user should be able to use that text format or not is a completely different question.
Comment #254
Wim LeersHaha, you're right, of course.
Text formats/processed text deals with user input. There is no automagic way to guarantee safety. Some filters do some level of HTML restrictions. But you still cannot know if that just means "disallowing the
<marquee>
tag" or "disallowing all unsafe input". Your site's configuration of text formats reflects which users you trust blindly, and which you don't. But in the end of the day, what comes out of the filter system, is what must be displayed, without further post-processing.(I answered #252 purely from a "safe markup" POV, which text formats absolutely do not guarantee 100%, because the site builder can surely change the configuration of text formats which would make them unsafe. But that, as Berdir points out, is a completely different question.)
Comment #255
larowlanOK so we can reverse that hunk and use @berdirs comment as basis for documenting the new safe markup set call, will do this tomorrow
Comment #256
larowlanDoes #255 as promised, think this is ready again
Updating remaining tasks
Comment #257
Fabianx CreditAttribution: Fabianx as a volunteer and at Tag1 Consulting commentedReviewed again, back to RTBC!
Comment #258
lauriiiRTBC++
Comment #259
Wim LeersSorry, but I think I need to kick this back.
Hrm, doesn't this mean that certain HTML tags are now effectively impossible to render? I.e. any tag that is not in
\Drupal\Component\Utility\Xss::$adminTags
?A simple example would be custom HTML tags. I'd like to see a test disproving me :)
Why this change? In the old code,
#post_render_cache
callbacks were being executed. Not anymore now. Or is arenderRoot()
call already happening elsewhere?Unnecessary changes.
This is only part of the answer to my first question.
Also: this text is misleading, because it suggests a blacklist is used, but actually a whitelist is used.
The comment is now wrong.
Comment #260
Fabianx CreditAttribution: Fabianx as a volunteer and commented#259:
1. Use:
$element['#markup'] = SafeMarkup::format('I really need those tags and I am too lazy to write a template, however I know it is safe.');
OR an inline template.
2. Yes, I am still uncomfortable with that as well - but there was lots of discussions and a test added already, given we push back the data via a response, we should as well be able to SafeMarkup:: set() before calling renderRoot() - the end result is the same, the markup is pushed verbatim to the user.
3. Can be fixed :)
4. Yeah, should be improved. I think for the 99% case #markup is not needed. If you need '' or '
' need to do some work :). 5. Oh, nice find :).Comment #261
lauriii#259.1: Isn't it enough that it is possible to set markup as safe in the parts where someone wants to use non-whitelisted html element? That is not the right way because templates should be used anyway but I don't think there is better way.
Comment #262
Wim Leers#260.1/#261: thanks, that makes sense.
#261 changed #259.2, but #260.2 suggests that this is intentional. So… I'm confused now about what it is that we want here.
Comment #264
Fabianx CreditAttribution: Fabianx as a volunteer and commentedFor tests to pass need:
$output['#markup'] = SafeMarkup::set($output['#markup']);
before this.
Comment #265
lauriiiComment #267
lauriiiNow this should pass
Comment #268
xjmFixing a few grammar things.
Comment #269
xjmAttached cleans up a few comments. Mostly just grammar polish except for this:
So, this is subtle, but I don't think we should ever say we use
SafeMarkup::set()
to "avoid [text] being escaped again" as that is exactly what we don't want other code to do. I've rewritten the comment in the attached patch to clarify.Comment #271
xjmI rolled my eyes at myself and cancelled the test run for my interdiff. However, I was going to set this to NW anyway for this:
Hmm, I'm not keen on adding another stray
SafeMarkup::set()
call in Views, nor another one so far away from where the markup is composed. How do we know this markup is safe? Looking in therender()
method on this class, the output ischeckPlain()
ed stuff wrapped in a<pre>
tag if it's a live preview, but what about when it's actually being rendered?At a minimum, we need to document why the call here is safe, but it'd be better to find a way to remove it. (That's the parent critical, after all.)
Comment #272
dawehner... in other words, we are not dealing with HTML nor with twig templates. In this case we have already constructed the entire HTML, so why care about it? I totally agree that we need some explanation here.
Comment #273
Fabianx CreditAttribution: Fabianx as a volunteer and commented#271:
What we before was doing was:
By doing so we implicitly trusted #markup and send it to the browser. (that is similar to marking it as safe in effect).
Therefore the explicit SafeMarkup::set() is way better already. (as we make explicit, which we implicitly assumed before).
Also renderRoot() will in the near future return a cacheable render array, so not calling the renderer - even though it is markup is wrong in several ways.
Comment #274
xjmBut why would we do this for a response that isn't even being handled for Twig?
Here was the test failure this was intended to fix:
https://qa.drupal.org/pifr/test/1057283
Which... I'm worried that the closing tag for that paragraph is escaped but not the first one; what's going on with that? Also, we should figure out if this is being escaped in the parent environment or the Views preview or what, because I still feel like this
SafeMarkup::set()
should not be added.Comment #275
Fabianx CreditAttribution: Fabianx as a volunteer and commented#274 This is likely due to how Xss::filterAdmin() works.
Twig is not involved, but the renderRoot() sees #markup, sees its not marked as safe and runs Xss::filterAdmin() on it.
Comment #276
xjmComment #277
dawehnerI was honestly NEVER a fan of abusing the render API here. We already set #markup below, no #attachements will be bubbled etc.
#2381277: Make Views use render caching and remove Views' own "output caching" thought will use the renderRoot() call in order to use render caching here.
Given that explanation, a
SafeMarkup::set()
calls seems to be the semantically right thing to do, we know its safe output, and not intended to be used for the browser,or at least not intended to be used as HTML.
Comment #278
xjmSo the issue summary says:
Did that get reverted or removed from the patch? I don't see any changes in the patch that seem to be doing that. Is that was reverted based on #260 and following? And that sounds like it would have bearing on the REST plugin output issue.
A question for @dawehner:
Abusing the render API in what sense? I'm not sure if this means the one
renderRoot()
call in the Views plugin or the entire approach for filtering#markup
in this patch. :) IfSafeMarkup::set()
is the correct approach with the current state of HEAD for this patch, is there a followup we could file, something posptoned on #2381277: Make Views use render caching and remove Views' own "output caching", that would get rid of it and address your concern here?In general it seems weird to me that
SafeMarkup
should have any bearing on output that never even is going to see a Twig template. (I also worried for a moment about the memory overhead being added to REST calls. OTOH any translated string is already going to be added to the SafeMarkup list anyway, so that's hardly new here.)At a minimum though we still for sure need a big long comment explaining it (and referencing a relevant followup if there is one). Updated the remaining tasks to clarify that.
Comment #279
dawehnerIf you look at
\Drupal\rest\Plugin\views\display\RestExport::render
and\Drupal\rest\Plugin\views\display\RestExport::execute
basically everything which is done is the following:
Well you know, we can just add it, but add a todo
Comment #280
xjmSo I discussed this at length with @WimLeers and @dawehner.
Response
rather than aCacheableResponse
, it'd only get invalidated on cache expiration rather than on any changes to the data in the view (violating the expectations of D8 cacheability and generally not behaving like a view should for its cache tags).#markup
element to be, well, markup, and sanitizing it as such.SafeMarkup::set()
there: It's "safe" markup in the sense that it's not "markup" (except in the case of the Views preview). Since there's no markup, there's no markup to inject XSS into. :P (The preview ischeckPlain()
ed but I think that's irrelevant for theSafeMarkup::set()
call because I thinkDisplayPluginInterface::execute()
andDisplayPluginInterface::preview()
are mutually exclusive.)So @dawehner and I agreed that for now, a
SafeMarkup::set()
call is acceptable for this issue to skip the filtering indoRender()
, and we should add it with an @todo linking the followup issue mentioned above.Phew!
Edit: reformatting for more readability.
Comment #281
larowlanWe're not double rendering rest exports anymore, so removed that line
Comment #282
dawehnerThank you xjm to write things down properly!
So we now need someone documenting the new call, due to the great writeup in #280
Comment #283
xjmJust adding that to the remaining tasks. Thanks @larowlan and @dawehner.
Comment #284
xjmComment #285
lokapujyaOpened: #2501313: Discuss how to support non-HTML output in the render API
Comment #286
lauriiiComment #287
lokapujyaComment #288
joelpittetThis should be
@
instead of!
since it's dealing with markup.We've been trying to re-roll this but many failures on RendererPlaceholdersTest are holding us back a bit. Most of which came from this change it seems #2478483: Introduce placeholders (#lazy_builder) to replace #post_render_cache
Comment #289
xjmRe: #288 and
@
vs.!
, what if someone wants to use the API to render a<script>
tag? The function already carries a warning that it is not sanitized. (This goes back to the discussion about reverting that change to!
.)Comment #290
joelpittet@xjm re #289 from that other issue we can now do the same as
@
with!
. The only real difference is that if the variable is unsafe going in will change the output of the whole string. If they both have safe strings going in they will produce the exact same output.So in general you should use @ always. May I misinterpreted your question? Do you have some example code?
#2399037-41: String::format() marks a resulting string as safe even when passed an unsafe passthrough argument
Comment #291
lokapujyaRerolled, but this will have a lot of fails. Somehow, the placeholders patch is interfering with XSS escaping. Possibly seeing the errors will help figure out what is conflicting.
Comment #292
joelpittetHere is that change I mentioned in #288 against #291 Should fail the same from that lazy_builder stuff.
Comment #293
xjm@joelpittet, yeah, my previous comment in #289 was kinda nonsense. :P Sorry about that. Obviously there's no sane usecase to put a
<script>
tag inside a<noscript>
. I think I will sleep before trying to remember or explain what it was I was thinking. :)Comment #296
larowlanSo it looks like drupal_set_message is broken in the tests, digging in
Comment #297
larowlanthe placeholder markup is always safe
Comment #299
lauriiiComment #300
joelpittetThank you @lauriii and @larowlan for resolving that.
@lokapujya and I are reviewing this in depth today at the sprint.
I've added a test to make sure we know we are getting a simple strong tag in
#markup
still because we noticed there is no other checks for tags in#markup
.Comment #301
xjmThanks @lauriii and @larowlan.
My concern about this is that we're not only marking this placeholder as safe for generation here; we're also saying it's something someone can enter somewhere else and not get sanitized. And possibly even cause whatever the placeholder turns into to be rendered in their field!? I feel like a correct fix would be to change how the render API assembles the placeholders into the string.
In general, when we added the
SafeMarkup
API, we said the only way it could really work would be if the safe markup list did not get individual tags added to it, and I feel like the placeholder is similar to that.We might be able to move that to a followup (since the placeholder itself is known safe and it would unblock other work here), but I'm afraid it would also be a critical followup.
I thought about it a bit and am thinking adding
SafeMarkup::set()
to the unit test is okay if we add the other call in the Renderer, because (a) it's a unit test and (b) it mimics an API call in the thing it's testing. It needs a comment documenting the use though, and if we end up creating a followup issue, a todo linking to it just like the main call in the Renderer.Nitpick: "non-XSS" (with a hyphen). :) +1 for this test!
I'll ping a couple people for more feedback on #1.
Comment #302
joelpittetThis may be a bit clearer on how it's safe than that comment was showing and we don't have to write SafeMarkup::set() here.
Fixes 301.1 and 301.3
Comment #303
joelpittetSo that was weird... I some how posted the files not to the comment... try again.
Comment #304
lokapujyaRemoving another SafeMarkup::Set().
Comment #305
lauriiiFor #302.2
Comment #306
xjmAwesome work on those!
Are we good to revert these now too?
Comment #307
effulgentsia CreditAttribution: effulgentsia at Acquia commentedThis patch looks great! Awesome work! #306 can be resolved by replacing with a SafeMarkup::format() inside the $generate_placeholder_markup() function. Also:
Needs a comment (with some of the info from #280.8) and link to #2501313: Discuss how to support non-HTML output in the render system.
Other than that, this looks ready to me.
Comment #308
larowlanComment #309
larowlanFixes 307, think we're close here
Comment #312
larowlanFollowing feedback/discussion on IRC with @xjm
Comment #313
larowlanUpdating remaining tasks to
None
to reflect the fact that we're back to needs review or possibly rtbc again.Comment #314
xjmAlso all the followups are filed. :)
Comment #315
effulgentsia CreditAttribution: effulgentsia at Acquia commentedHow about this doc change? If this isn't ok, I'm ok with #312 committed instead and us working it out in the followup.
Comment #316
joelpittetI like this but isn't it
text/html
?Comment #317
larowlanNo, the request format will extract the format from the mime-type - e.g. this data provider for the unit test
Comment #318
effulgentsia CreditAttribution: effulgentsia at Acquia commentedSince #315 is a small change, and given #316 and #317, I think I'm eligible to RTBC, so doing so. Assigning to @xjm for commit so she can verify all her concerns have been addressed.
Comment #319
dawehnerI like the doc change in #315!
Comment #320
xjmI reviewed this ONE LAST TIME and everything is looking really great. Some specific notes:
This conditional statement is more reliable and the documentation is excellent!
+1 for adding this assertion; it's always good to provide one like this before a negative assertion. (The assertTrue() following is a negative assertion because it's asserting true that it's exactly false. assertFalse() would be incorrect because that would match 0 for strpos().)
I had to squint at this assertion a second because 1. It's asserting that it's true that something is exactly false. 2. The docs for
getRawContent()
are on the less helpful side.I think a fix is to make the assertion message a little more specific and thus the assertion more self-documenting. Done in attached.
I'm going to reroll to improve this slightly. The casing in the class name is wrong, we can link directly to the list of allowed tags, and clarify a little more. Attached.
I'm still comfortable with committing this since my updates here and back in #269 were small docs updates, but I would like a teeny review or signoff on my change here first. Leaving at RTBC though to emphasize that I think it's still RTBC either way...
Comment #321
xjmaugh missing space. THAT I'll fix on commit. ;)
Comment #322
joelpittetA bit better comments thanks @xjm, I'm signing off save for the the space you can fix on commit:)
+1 better description.
needs a space before
<style>
.Comment #323
star-szrI just gave this a once-over as well. Looks great, thanks all!
Comment #325
xjmAlright, committed and pushed to 8.0.x!
https://twitter.com/xjmdrupal/status/607620233671659520
Thanks everyone for your sharp eyes, sharp thinking, and perseverance on this issue.
Comment #326
willzyx CreditAttribution: willzyx commented+1 on doing this but currently are there alternatives for insert markup with css/js through the render api? variables pretty print in Devel (Krumo, Kint) is broken (see #2503591: (entity)/edit/devel pages are broken) because of this and the suggested alternatives in the CR are a little vague and do not offer a valid solution:
The only solution seems to be use SafeMarkup::set() for printing output generated from third part debug library (Krumo,Kint etc).
Any suggestion?
Comment #327
effulgentsia CreditAttribution: effulgentsia at Acquia commentedPersonally, I think calling SafeMarkup::set() on the return value from a library such as Krumo is legitimate. Especially if you add a comment on why you believe it to be safe (e.g., that it's not including any user-entered strings, or if it is, then they've been properly filtered). Curious if anyone here has other suggestions.
Comment #328
effulgentsia CreditAttribution: effulgentsia at Acquia commentedYou can also do:
Which would have the benefit of not directly affecting the SafeMarkup registry, thereby benefiting from future optimizations such as #2503447: Use Twig_Markup to help with the SafeMarkup::$safeStrings memory load from Renderer::doRender().
You can also use the
|raw
filter in a larger template (not just a dedicated inline one as above), but be very careful with that, since any template that uses the raw filter needs to be absolutely certain that whatever is passed in doesn't contain an XSS attack.Comment #329
xjmFor #326 to #328 let's file a followup issue to discuss it further since this issue is so long already. Also see #2280965: [meta] Remove every SafeMarkup::set() call and #2494297: [no patch] Consolidate change records relating to safe markup and filtering/escaping to ensure cross references exist. Thanks!
Comment #330
Wim LeersI think I'm reading that the concern is that whatever markup the placeholder is replaced with, that that replacing markup is not guaranteed to be safe? That indeed used to be the responsibility of the callback that replaced the placeholder. But as of #2478483: Introduce placeholders (#lazy_builder) to replace #post_render_cache, that is actually guaranteed to be safe, since replacing the placeholder is always done in the same standardized way that goes through the Render API. See
\Drupal\Core\Render\Renderer::replacePlaceholders()
.Comment #331
pwolanin CreditAttribution: pwolanin at Acquia commentedFor something like the Krumo markup - is that a use case for wrapping it in Twig_Markup?
Comment #333
xjmComment #334
alimac CreditAttribution: alimac at University of Illinois at Chicago commentedFollow-ups filed from this issue:
Since this issue is fixed, and all follow-ups have been opened, then we can move
core/lib/Drupal/Core/Render/Renderer.php
#2273925: Ensure #markup is XSS escaped in Renderer::doRender() only covers one use and will need a followup.
#2506581: Remove SafeMarkup::set() from Renderer::doRender
to the Done section in #2280965: [meta] Remove every SafeMarkup::set() call