Follow-up to: #1825952: Turn on twig autoescape by default
Problem/Motivation
In #1825952: Turn on twig autoescape by default, \Drupal\Core\Render\Element\HtmlTag::preRenderHtmlTag() assembles HTML markup based on input and it is subsequently marked as safe. However, there is no guarantee that the rendered markup is actually safe.
This is not a security regression because the same is true in HEAD; HtmlTag::preRenderHtmlTag()
will render whatever the caller tells it to, so it's the caller's responsibility to sanitize the input. However, this is one of the only places that we are marking as safe markup strings that are not explicitly known to be safe.
Proposed resolution
Remaining tasks
- (done) Evaluate whether the string can be refactored to one of the formats outlined in this change record: https://www.drupal.org/node/2311123
- (done) Identify whether there is existing automated test coverage for the sanitization of the string. If there is, list the test in the issue summary. If there isn't, add an automated test for it.
If the string cannot be refactored, the SafeMarkup::set() usage needs to be thoroughly audited and documented.
See #47
Manual testing steps
Do these steps both with HEAD and with the patch applied:
- Install Drupal 8.
- Log into Drupal.
- Review the HTML of the
<script>
code block at the bottom of any page. This code block has inline javascript beginning with "var drupalSettings =". - Compare the output above in HEAD and with the patch applied. Confirm that the HTML is identical.
User interface changes
NA
API changes
NA
Comment | File | Size | Author |
---|---|---|---|
#73 | remove-2296101-73.patch | 6.38 KB | josephdpurcell |
#73 | interdiff-remove-2296101-70-73.txt | 842 bytes | josephdpurcell |
#70 | remove-2296101-70.patch | 6.38 KB | josephdpurcell |
#70 | interdiff-remove-2296101-68-70.txt | 498 bytes | josephdpurcell |
#68 | remove-2296101-68.patch | 6.62 KB | josephdpurcell |
Comments
Comment #1
xjmComment #2
xjmComment #3
jibranComment #4
dawehnerThe amount of spaces where we still use
#type => 'html_tag'
is really limited. I think we should try to maybe get rid of itor at least move every place in core into a template. (maybe 5 non-test cases)
Comment #5
Anonymous (not verified) CreditAttribution: Anonymous commenteddrupal_pre_render_html_tag
got renamed/removed, but my google skills and I can't figure out where it went. Could this be updated in the title?Re #4: I can't find
#type => 'html_tag'
either.Can the summary be updated, or can this issue be closed?
Comment #6
star-szr@pjonckiere - this is where I love
git log -S drupal_pre_render_html_tag
:) updated the title and issue summary. Thanks!Comment #7
catchI'd be fine with removing
#type => 'html_tag'
and then not having to solve this.We originally added that to allow themes to switch between XHTML and HTML5 easily, but 8.x (and everything else) is firmly HTML5 at this point, so there's really no reason for it at all.
Also bumping to major since the more of these we fix, the smaller #2297703: [meta] Refactor and remove as many SafeMarkup::set() calls as possible becomes.
Comment #8
xjm#2273925: Ensure #markup is XSS escaped in Renderer::doRender() is removing one of the uses in this method, but there are others.
Comment #9
joelpittetI'm curious if this will fly.
Comment #11
aneek CreditAttribution: aneek commented@joelpittet, this totally breaks the application. Tested manually once. PFA the screenshot. I wonder that the usage of "@" in the tags might making double escaping. Though I haven't debugged thoroughly.
But this one I have checked,
Using patch #9, while loading a tag of type script is double escaped. So in the body footer the JavaScript generated is double escaped like,
Where it should be,
As one of the solution we have to find an way to deal with the script type of tags, where
$element[#tag] == 'script'
.Comment #12
joelpittetThanks for the manual review @aneek. Since we are using SafeMarkup::checkAdminXss() on prefix/suffix and of late #markup. We can do the same here for value and it's prefix/suffix.
Let's see how this blends...
Comment #13
joelpittetNice green! @aneek mind giving that a try on for size?
Comment #14
aneek CreditAttribution: aneek as a volunteer commented@joelpittet, I might. Okay tonight then.
Comment #15
aneek CreditAttribution: aneek as a volunteer commented@joelpittet, so I only removed two variable initializations
$value_prefix
&$value_suffix
. Do we require those? Those are not used anywhere else, so might save very very little memory on this.I hope this is acceptable if the result is green. :-)
Comment #16
joelpittetOh good call thanks @aneek it also removes the function calls to the checkAdminXss if not needed!
Comment #17
aneek CreditAttribution: aneek as a volunteer commented@joelpittet, most welcome :-)
Thanks!
Comment #18
seantwalshLooked over the patch on the Shortcuts page as in #11. While visually the patch in #15 appears fine and it passes test, the data in the
tag at the bottom of the page is still double escaped. Before: After:Comment #19
seantwalshIf I change SafeMarkup::checkAdminXss to SafeMarkup::format it renders correctly. The cause seems to be that SafeMarkup::checkAdminXss does not allow the CDATA markup that should render in this case.
Comment #20
akalata CreditAttribution: akalata as a volunteer commentedDid you make any other changes besides just the method name? Simply switching the method names would actually make this less secure, because you'd be passing $element['#value'] as a literal string.
However, calling
SafeMarkup::format('<!--//--><![CDATA[//><!-- @element //--><!]]>', array(@element => $element['#value']))
seems like it could work?Comment #21
seantwalsh@akalata and I spoke about this IRL, attempting to summarize that discussion and potential next steps.
Based on looking at what the patch in #15 does, it seems that either:
Comment #22
akalata CreditAttribution: akalata as a volunteer commented1. For easier evaluation, here's the full comment for
preRenderHtmlTag()
, which is what led us to question the purpose of this issue:2. Clarification: If we do decide to include some sanitization here, I think that only the
value_prefix
andvalue_suffix
should be flagged using SafeMarkup::format (accepting these as provided by the caller), while potentially continuing to have$element['#value']
run through a filter? This seems to be a gray area we're running into while addressing these conversions.Comment #23
davidhernandezComment #24
joelpittet@akalata re #22.2. Yeah it's a bit grey for sure. I was going on the side of caution for
#value
but still permissive to an extent. But maybe you are right let's just let it escape if it's not safe. That's probably a good call.Comment #25
joelpittet@akalata re #22.1 @chx and I put that in there because we didn't know what to do with it at the time during the initial auto-escape conversion;) I recall this conversion explicitly and it resulted in throwing arms in the air and saying "we can't prevent people from doing stupid things entirely or DX will suffer" Paraphrasing from what I remember
Comment #26
alexpottI've done some investigation as to whether we can remove the SafeMarkup::set() calls completely from this file and I think we can! But we need both #2506581: Remove SafeMarkup::set() from Renderer::doRender and #2506195: Remove SafeMarkup::set() from Xss::filter().
If we do that then we can make the html_tag element have way less side effects and completely remove the possibility of it marking something unintentionally as safe. Therefore I'm going to postpone this issue on them
Comment #27
alexpottNow doing the real postponing as per #26 :)
Comment #28
alexpottAll the blocking issues have landed
Comment #29
alexpottAs this is definitely part of the render system I think it is appropriate to auto-escape and auto-filter if things are marked as safe and use SafeString::create here. That way no SafeMarkup::set() is used.
No interdiff because a completely different approach.
Comment #30
YesCT CreditAttribution: YesCT commentedI'm going to review this.
Comment #31
YesCT CreditAttribution: YesCT commentedI'm not understanding how user input gets into the drupalSettings in the script tag at the end of the pages.
What are we worried about that might not be safe?
Comment #32
YesCT CreditAttribution: YesCT commentedin head
with the patch:
looks like it is missing the prefix and suffix.
I'm going to take a look at that now.
Comment #33
davidhernandezI'm not sure what we're considering acceptable or not. This is going to filter out the CDATA prefix.
Comment #34
davidhernandezWhat else can actually make a prefix or suffix? For the most part this seems to only be used for script tags now?
Comment #35
YesCT CreditAttribution: YesCT commentedso, it seems to me that we either
a)
dont check if prefixes and suffixes are safe, and assume they are,
or
b)
to use the new
protected static function xssFilterAdminIfUnsafe($string) {
we go to where those are set and mark them safe (and document why) there.
this starts doing b) (but still needs the comments)
[edit]
ag CDATA core --ignore="core/vendor" --ignore="test" --ignore="*Test.php" --ignore="*.js"
was how I started looking for places to mark safe at the source of the prefix and suffixthe change in
core/lib/Drupal/Core/Asset/JsCollectionRenderer.php
can be seen like the previous manual reviews here on just about any page (like the shortcuts page)
in the script tag at the bottom of the html source file.
the change in
core/lib/Drupal/Component/Utility/Html.php
I'm not sure where to see, maybe by making drupal output xhtml?
uploading code mostly to demonstrate what I'm talking about.
Comment #36
YesCT CreditAttribution: YesCT commentedto help in thinking about @catch's suggestion in #7
ag "\'html_tag" core
is helpful.
Comment #37
dsnopekAnother option (which we discussed at DrupalCorn) is just not using
#value_prefix
and#value_suffix
in core for making CSS and Javascript compatible with XHTML, because Drupal 8 is all HTML5, but leaving those as valid arguments for the theme function just in case someone in contrib wanted to create an XHTML theme.Comment #38
akalata CreditAttribution: akalata as a volunteer commentedIn discussing our options, I think we missed the suggestion in #4 and #7 to completely remove
#type => 'html_tag'
. This is may out-of-scope based on the original issue, but after review the non-test uses in core can be grouped into 2 categories:<head>
elements such as<meta>, <link>,
and<script>
, and simple combinations whereSafeMarkup::format
could be used instead._drupal_add_html_head()
is listed as deprecated (#2477223: Refactor _drupal_add_html_head, drupal_get_html_head, _drupal_add_html_head_link into the attachments processor, remove from common.inc.), so we may need to postpone on that. In the meantime, we could work on the 6 places where non-head-type tags are using this#type
? (One of those places is FilterCaption.php, which has its own use of SafeMarkup).Comment #39
akalata CreditAttribution: akalata as a volunteer commentedTo prevent scope creep, #2544318: Remove #type => html_tag usages has been created. I've also identified #2120113: XHTML is not a thing anymore, remove <!--//--><![CDATA[//><!-- //--><!]]> for escaping inline JS/CSS as the additional task suggested in #37.
I will be updating the issue summary and proceed with documenting as suggested in #35 - option A.
Comment #40
akalata CreditAttribution: akalata as a volunteer commentedComment #41
akalata CreditAttribution: akalata as a volunteer commentedUpdating from #29, updating #value_prefix and #value_suffix since filtering strips CDATA tags.
Comment #42
akalata CreditAttribution: akalata as a volunteer commentedComment #44
akalata CreditAttribution: akalata as a volunteer commentedAdding manual testing instructions.
Comment #45
akalata CreditAttribution: akalata as a volunteer commentedUpdating tests since we're no longer marking #value_prefix or #value_suffix as safe.
Comment #46
YesCT CreditAttribution: YesCT commentedah, we are assuming prefixes and suffixes are safe, so we are *not* filtering them. so we can not have a test that asserts script tag is filtered from prefix or suffix. ok.
but maybe we should have a test that asserts we *are* letting script through via prefix and suffex... which is weird to assert we can do something dangerous. but we are making the decision here to do it on purpose. so at least a blame on the test would link back to this issue which has our logic for doing this.
Comment #47
akalata CreditAttribution: akalata as a volunteer commentedAdding a test per #46 the ensure #value_prefix and #value_suffix are not escaped/filtered. Includes an example patch that DOES filter #value_prefix and #value_suffix, causing the test to fail.
Comment #49
akalata CreditAttribution: akalata as a volunteer commentedComment #50
akalata CreditAttribution: akalata as a volunteer commentedPer @YesCT's in-person request, posting an interdiff of what was changed in order to make the test fail.
Comment #51
YesCT CreditAttribution: YesCT commentedThat is perfect. I like the new test.
manually tested again on head and with the patch, and the html markup is exactly the same (except the hash)
prefix and the suffix have an additional warning about being assumed to be safe.
but this is less strongly worded and in a place not as noticable as the warning was before.
we should keep some kind of general warning near the top of the doc block for this method.
in comment #41 the interdiff shows this ::set() being added.
but we need to document why it is ok, or refactor this one too.
this was added in #29 and called 3 times, on the prefix, value, suffix.
Now, we are assuming prefix and suffix are save and *NOT* filtering if unsafe. (cause it broke CDATA)
so... we should think a bit about if we still want the HtmlTag::xssFilterAdminIfUnsafe()
@akalata suggests maybe we could just use filterAdmin() .. the difference would be not calling that if the value was already safe.
I can't totally wrap my head around this to have an opinion.
other than these things I think this looks really good.
Comment #52
alexpottLet's use SafeString::create() here since this is mimicking what's going on in Renderer::render().
We could just inline the isSafe() check now.
If we do keep this the SafeString::create is not necessary as the result is immediately stringified.
Comment #53
cilefen CreditAttribution: cilefen commentedComment #54
YesCT CreditAttribution: YesCT commentedthis addresses all of my concerns from #51 and alexpott's from #52.
if it is green, rtbc from me.
i did manual testing again in #51 so removing that tag.
Comment #55
YesCT CreditAttribution: YesCT commented(super small nit) one too many newlines taken out when this function was removed.
----
and... the issue summary still has a remaining task to evaluate if there is test coverage and document so in the issue summary. :/
Comment #56
cilefen CreditAttribution: cilefen commentedComment #57
davidhernandez#47 is clearly indicating test coverage. Does the last remaining task still need review?
Comment #58
josephdpurcell CreditAttribution: josephdpurcell commentedReviewing this now.
Comment #59
josephdpurcell CreditAttribution: josephdpurcell commentedI read through the patch to see if the third task is complete. I think that #tag, #value, #value_prefix, and #value_suffix are documented well.
Here are a few comments I have from review:
This use statement is out of alphabetical order. The UnitTestCase is also out of order, so either sort all of them, or just append to the end.
Change "if they are" to "if it is".
#attributes
does not need escaped here. It does not need to be escaped, because the attribute names are escaped atDrupal\Core\Template\AttributeValueBase:render()
and the attribute values are escaped atDrupal\Core\Template\AttributeValueBase:__toString()
. The class documentation forDrupal\Core\Template\Attribute
explains some of these assumptions on a high level, so perhaps simply link to that doc?I will plan to address these now.
Comment #60
josephdpurcell CreditAttribution: josephdpurcell commentedRerolling. No conflicts. Changes still pending, this is just reroll.
Comment #61
josephdpurcell CreditAttribution: josephdpurcell commentedComment #62
josephdpurcell CreditAttribution: josephdpurcell commentedAddresses #59.
Comment #63
josephdpurcell CreditAttribution: josephdpurcell commentedUpdating doc to include leading backslash on class name, see https://www.drupal.org/node/1354#classes
Comment #67
YesCT CreditAttribution: YesCT commentedthe 3-way auto-merging didn't have any conflicts... but it ended up putting in a duplicate use statement. #2544262: Refactor use of SafeMarkup::set in \Drupal\Core\Render\Element\HtmlTag::preRenderConditionalComments()
Comment #68
josephdpurcell CreditAttribution: josephdpurcell commentedDoh! Fixes duplicate use pointed out in #67.
Comment #69
YesCT CreditAttribution: YesCT commented... ack. I know it is frustrating, but the reordering is not needed as part of this issue. We can just leave it where it was (even though the use statements are out of alphabetical order... and alphabetical order isn't really a standard we have #1624564: Coding standards for "use" statements)
Comment #70
josephdpurcell CreditAttribution: josephdpurcell commentedUnalphanumericreordering per #69.
Comment #71
josephdpurcell CreditAttribution: josephdpurcell commentedComment #72
YesCT CreditAttribution: YesCT commentedI read the whole patch again. and it all looks good. just a couple nits.
html in comments should be all caps: HTML
(I verified this by:
grep -R "\/\/.* HTML " core |wc -l
which had 248
grep -R "\/\/.* html " core |wc -l
which had 19
so caps is a clear winner.)
I think "cannot" is what we mean here.
Comment #73
josephdpurcell CreditAttribution: josephdpurcell commentedUpdating the grammers from #72.
Comment #74
YesCT CreditAttribution: YesCT commentedno code changes since the last manual testing. so I think this is ready.
Comment #75
davidhernandezActually, we're changing that. I wouldn't hold up this issue either way on it though.
See #2546248: Use consistent style to mention HTML tags in code comments
Comment #76
alexpottCommitted 8071554 and pushed to 8.0.x. Thanks!