Follow-up to #2297703: [meta] Refactor and remove as many SafeMarkup::set() calls as possible
Remove SafeMarkup::set() from title, it's not needed with strip_tags() and if anything it makes things worse and unsafe if strip_tags fails to safely remove broken tags.
$head_title = array(
'title' => SafeMarkup::set(trim(strip_tags($page->getTitle()))),
'name' => String::checkPlain($site_config->get('name')),
);
Beta phase evaluation
Issue category | Task because it does not fix any bugs or introduce any new features |
---|---|
Issue priority | Major because the current implementation incorrectly implements Safemarkup |
Prioritized changes | The main goal of this issue is security and code cleanup. |
Disruption | Not disruptive |
Comment | File | Size | Author |
---|---|---|---|
#67 | 2369987-2.67.patch | 6.39 KB | alexpott |
#67 | 52-67-interdiff.txt | 1.92 KB | alexpott |
#63 | 2369987.63.patch | 5.67 KB | alexpott |
#63 | 52-63-interdiff.txt | 1.2 KB | alexpott |
#52 | 2369987-52-safe-markup-set.patch | 4.47 KB | joelpittet |
Comments
Comment #1
joelpittetComment #2
joelpittetActivate!
Comment #4
lauriiiNot sure if its possible to use SafeMarkup::escape for title because its still possible that theres gonna be some special characters and doesnt support UTF8 characters.
Comment #5
joelpittet@lauriii thanks, I meant to postpone this on the #2352155: Remove HtmlFragment/HtmlPage because it will conflict.
Comment #6
Wim Leers#2352155: Remove HtmlFragment/HtmlPage landed, this can now continue.
Comment #7
pgautam CreditAttribution: pgautam commentedPatch.
Comment #9
aneek CreditAttribution: aneek commentedThe tests fail due to twig double escape in title tags.
The function should generate
Edit Basic page <script>alert("xss")</script>csF2Z9AM | Drupal
while generated isEdit Basic page &lt;script&gt;alert(&quot;xss&quot;)&lt;/script&gt;pNdNdJSz | Drupal
.Comment #10
aneek CreditAttribution: aneek commentedCreated a patch to fix the double escaping. But I've used "renderer" service and
#markup
to print the page title. Also titles are run throughSafeMarkup::escape()
in this patch. But based on @lauriii's comment #4, I have also added a special character test case to check a node with special characters in the title. So far it worked.But I think there can be a more better solution rather than using
#markup
, which itself runsSafeMarkup::set()
@ #2273925: Ensure #markup is XSS escaped in Renderer::doRender().Comment #11
aneek CreditAttribution: aneek commentedComment #13
aneek CreditAttribution: aneek commentedPatch #10 introduces a security hole in titles for XSS. Need to change the logic here may be. :-(
Comment #14
aneek CreditAttribution: aneek commentedWith a different approach. Let's see if bot returns green :-)
Comment #15
aneek CreditAttribution: aneek commentedComment #18
aneek CreditAttribution: aneek commentedOkay now. Bot returned Green in second attempt. Anyway, review please. Do you think this could be the right approach to address this issue?
Thanks!
Comment #19
idebr CreditAttribution: idebr commented@aneek The inline template approach looks very clean! It seems the test from #10 got lost along the way, or was this intentional?
Comment #20
joelpittetLove it, I think this is a great use of inline_template.
Is this the new drupal_render()?
Comment #21
idebr CreditAttribution: idebr commented@joelpittet drupal_render() has been marked deprecated and currently is just a wrapper around the renderer service:
Comment #22
joelpittetI feel like I half knew this... but that is cool regardless, thanks @idebr. Swappable now:) So I guess we'll be doing that call directly now off the service.
Comment #23
tstoecklerIs the explicit call to the renderer needed at all? AFAIK Twig checks whether a thing is a render array before printing it and calls the renderer itself. That would also allow for alterability in other preprocess functions.
Comment #24
joelpittet@tstoeckler Good call, unless we need that for some stringy stuff later, we can leave it as a renderable array and avoid the early call to drupal_render()/renderer::render()
Comment #25
aneek CreditAttribution: aneek commented@joelpittet & @idebr thanks for reviews. Joel, I think that we do need the renderer call. But I will check with keeping the inline_template call as array. I think it will print "Array" instead of printing the title.
But today I'll have a look.
@idebr, yes, I removed the test, it was just to ensure that special characters are properly printed with usage of SafeMarkup::escape calls.
Comment #26
aneek CreditAttribution: aneek commentedBased on @tstoeckler, comment created a new patch and working fine I think. (PageTitleTest.php gave me some warning). Lets see what the test bot has to offer.
Thanks!
Comment #27
aneek CreditAttribution: aneek commentedComment #28
aneek CreditAttribution: aneek commentedOkay bot returned Green. Any more suggestions? I think this type of fix can be adopted to fix other
SafeMarkup::set()
removals as well. What do you guys think??Thanks!
Comment #29
idebr CreditAttribution: idebr commentedThanks Aneek, I did a manual test to confirm the inline template properly escapes the title exactly once:
I have updated the issue summary to include a beta evaluation.
Comment #30
alexpottCommitted 45268c1 and pushed to 8.0.x. Thanks!
Thanks for adding the beta evaluation to the issue summary.
Comment #33
alexpottReverted this as this causes #2424743: Random testbot fail: "The page does not have double escaped HTML tags." in Drupal\field_ui\Tests\ManageFieldsTest. @aneek in #18 it would have been good if you could have documented why the patch failed the first time.
Comment #34
aneek CreditAttribution: aneek commented@alexpott, thanks. My mistake sorry :-(. I'll have a look why this happened. Sending #26 patch for re-test.
Edit #1
I created an article with title "&1e5&QE<" with the patch applied, everything seems to be fine. But if and only if I add a new field with the name "&1e5&QE<" then the title comes as double escaped. The value setting in the inline_template array is below,
And in the twig layer it's generating
&amp;1e5&amp;QE&lt; settings for Article | D8
So somehow
&
is double escaped to&
. We do need a way to tell twig that the string is previously escaped from the Drupal or may fix this in twig level?Edit #2
@alexpott, I don't think the patch failed due to this code changes in #18. I've ran a re-test and again it passed. Not totally clear to me. :-(
Any ideas? Thanks!
Comment #36
aneek CreditAttribution: aneek commentedWhile checking
template_preprocess_html()
in about line #1333 I had a debug statement of printing the title generated (trim(strip_tags($variables['page']['#title']))
).So this gives the following,
bool(true)
while viewing the node.bool(false)
while in the edit page.bool(false)
while in the field edit page.Any ideas why is that?
Comment #37
idebr CreditAttribution: idebr commented@aneek NodeViewController::title() returns a string that is already checked with String::checkPlain():
Comment #38
aneek CreditAttribution: aneek commented@idebr, yes correct. But if you check
public function form()
in/core/modules/node/src/NodeForm.php
then you will find,The node title and the content type name is check plain'd but this returns
bool(false)
if it's checked viaSafeMarkup::isSafe()
.Comment #39
aneek CreditAttribution: aneek commented@alexpott, I debugged this a bit. Here are my findings.
Given node title: &1e5&QE<
In node edit form(
public function form()
in/core/modules/node/src/NodeForm.php
)the code to generate the title is,
While this is processed, this goes to the
template_preprocess_html()
function to generate the page title. Here the string is marked as safe inSafeMarkup::set(trim(strip_tags($variables['page']['#title'])))
.In the
template_preprocess_html()
initially the RAW HTML title looks likeEdit Article &1e5&QE<
but if we run this viaSafeMarkup::isSafe()
it returns FALSE. This should return TRUE.I guess this is why
SafeMarkup::set()
code was used to mark the title string safe.Now, if we remove the
SafeMarkup::set()
call as below,while the string is passed to
twig_drupal_escape_filter()
method then inside the condition at line #246,the double escaping is happening. So in
$autoescape && SafeMarkup::isSafe($return, $strategy)
,$autoescape
returns TRUE and the other conditionSafeMarkup::isSafe($return, $strategy)
returns FALSE. So the string is again going viaString::checkPlain()
.Isn't it that a string like,
A <em>safe</em> markup string.
will return TRUE if passed via SafeMarkup::isSafe() but it's not in this case of the page titles. Actually the HTML tags are getting removed by strip_tags(). May be this is the cause?Please let me know your thoughts. Am I missing something?
Comment #40
aneek CreditAttribution: aneek commentedA new patch. Added
SafeMarkup::checkAdminXss()
, this will check if the title is safe or not and apply filter admin to it. Since the code sequence is like,no tags will be present for checkAdminXss to remove due to strip_tags function. So nothing will be ever removed from the title and thus the title will also be marked as safe. So there will be no double escape.
If someone feels that this needs a more efficient approach. Please let me know. And Does it needs any testing?
Please review!!
Comment #41
aneek CreditAttribution: aneek commentedComment #42
aneek CreditAttribution: aneek commentedComment #44
rteijeiro CreditAttribution: rteijeiro commentedFixed coding standards for arrays.
Comment #45
ianthomas_uk@rteijeiro Thanks for trying, but Drupal coding standards are to use the array() syntax, so the old patch was actually correct. See https://www.drupal.org/coding-standards#array . You've also included several unrelated changes (note how your patch is 21KB, but the previous patch is only 2KB).
Reviewers, please look at #41.
Comment #46
Mile23Needs reroll...
Comment #47
rteijeiro CreditAttribution: rteijeiro commentedRe-rolled!
Comment #49
jain_deepak CreditAttribution: jain_deepak commentedRerolled
Comment #50
joelpittetThanks for the rerolls @rteijeiro and @jain_deepak.
These lines sneaked in the patch and are unrelated.
This is a clever approach, though this is completely markup/theming specific I'd rather move that directly into the template instead of fussing about with it in the preprocess.
Comment #51
joelpittetI'm going to attempt a couple thing here, bare with me. But FYI there is no need to use SafeMarkup on HTML titles, because we can't put HTML in them anyway.
http://www.w3.org/TR/html401/struct/global.html#h-7.4.2
Comment #52
joelpittetOk going to be a bit bold here. Removes the check_plain and admin_xss stuff and let's the template deal with it as it comes(because it already escapes on print).
I deprecated
$variables['head_title_array']
because it's nicer if we just use{{ head_title|join(' • ') }}
or{{ head_title.slogan }}
and control over that presentation in the markup for TX.Comment #53
akalata CreditAttribution: akalata commentedThe comment for head_title in html.html.twig says "List of text elements...", but is that still the case? Looks like it was changed to a string.
I'm not knowledgeable enough to know if your nuking of the xss stuff is okay. :)
Comment #54
joelpittet@akalata yes it is a list, that's why it needs safe_join filter to joint the array elements.
Comment #55
joelpittetComment #56
joelpittet@akalata to be a bit more clear on what my intention is. I made head_title into the array. But treating it like a hash so it's not really a list of indexed elements and that's why you can do
{{ head_title.slogan }}
which I think reads better as not plural.Does that sound sane?
Comment #57
lauriiiI think the change is clean and fixes the issue. There is also beta evaluation in the issue summary.
Comment #58
alexpottCommitted 10c8777 and pushed to 8.0.x. Thanks!
Thanks for adding the beta evaluation to the issue summary.
Comment #61
alexpottCore just failed for:
I ran
Drupal\field_ui\Tests\ManageFieldsTest->fieldUIAddExistingField()
40 times before committing... seems like we didn't fix the random fail.Comment #62
alexpottI've run the test locally with the patch applied over 100 times and still no fails :(
Comment #63
alexpottFound it - thanks @joelpittet.
Patch attached fixes the random fail by changing the default label generating method to
randomManchineName
. If someone wants to test field labels with HTML entities in the label then this should be tested explicitly.Comment #64
catchGood find, back to RTBC.
Comment #65
Fabianx CreditAttribution: Fabianx as a volunteer commentedI opened #2487498: Make randomString always return a > to avoid random test fails as a follow-up to make random test failures a thing of the past (mostly) - at least for double escape issues.
Comment #66
alexpottHang on I have a better fix. Patch coming.
Comment #67
alexpottNew fix.
Comment #68
joelpittetNice, that should close off the follow-up too I think... @Fabianx or would you still like to explore that? May not be a bad idea to always do that so we can avoid other possible random test failures as they are tricky to track down.
This patch should be gold though;)
Comment #69
catchCommitted/pushed to 8.0.x, thanks!
Comment #71
xjm