Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
From https://www.drupal.org/docs/8/security/writing-secure-code-for-drupal-8 :
The Twig theme engine now auto escapes everything by default. That means that every string printed from a Twig template (e.g. anything between {{ }}) gets automatically sanitized if no filters are used.
Most of the Html:escape() calls in Search API are wrong and cause double escaping, which makes errors very hard to read.
Comment | File | Size | Author |
---|---|---|---|
#15 | 3001259-15--fix_double_escaping.patch | 20.2 KB | drunken monkey |
| |||
#15 | 3001259-15--fix_double_escaping--tests_only.patch | 4.65 KB | drunken monkey |
Comments
Comment #2
klausiPatch. I did not remove all of Html::escape because I'm not always sure of the context they are used in.
Comment #4
klausiOf course, fixed coding standards with phpcbf.
Comment #6
drunken monkeyDear Klausi,
The condescending tone is not appreciated. I am well aware of one of the biggest theming changes in Drupal 8, thank you very much. I also didn’t go around adding
Html::escape()
calls willy-nilly – that form#options
keys are not auto-escaped is one of the biggest DX WTFs I know, and something I’m pretty certain about. (I also re-tested it now, though, to be absolutely sure.) (That#description
keys are also not auto-escaped is something I must have discovered at one point, but I’d forgotten about that one again. Pretty WTF, too, if you ask me.)So, before just claiming that most of them are wrong, you might have just wanted to test it for yourself.
What might happen, though, of course, is that we already get passed a
Markup
object in some of these cases. Then, double-escaping would indeed occur, and we should guard against that. In the attached patch revision, I’m using a newUtility
method for that, which should make sure values are only escaped if they need to be.Furthermore, some of these are of course real mistakes that somehow slipped by, probably during porting. So, thanks a lot for spotting those!
In summary, also, thanks for reporting this issue – but please do try for a more polite tone in the future. (And test your patches/assumptions before posting.)
Comment #7
klausiOh, very sorry about the tone! You are doing great work :-)
I know that getting escaping right is not easy, and your approach of "better safe than sorry" totally makes sense.
The #options key situation sounds like a pretty severe Drupal core security hole, I will also test that and then report an issue to the Drupal security team.
IMO: In general, calling Html::escape() is almost always wrong in Drupal 8. Only if you are writing markup directly that bypasses auto-escaping somehow there might be a use for it. Otherwise we have a security hole in the Twig template layer that we need to report to the security team. Maybe I'm missing something?
I think your use of the Markup class is not correct here. Markup has the following docs: "This object should only be constructed with a known safe string. If there is any risk that the string contains user-entered data that has not been filtered first, it must not be used."
But you are passing user provided text into the Markup class - which is not its purpose. I will try to find out what to use instead, unfortunately the docs at https://www.drupal.org/docs/8/security/writing-secure-code-for-drupal-8 or not really helpful. As far as I can see all your rendering/theming goes through Twig templates, so everything should be escaped already?
Comment #8
klausiI just tested #options with radio buttons - the keys and the values. Everything escaped correctly, phew! So no need to report anything to the security team. #options values allow some HTML, so I assume Xss::filterAdmin() is used there. Which is ok I think, we don't care. Attached is a patch on my Drupal install that I used to quickly test it.
Any other #options vulnerability I might have missed?
Given that I think we can proceed with removing all Html::escape() calls?
Comment #10
klausiAh, you might also ask yourself why the test are failing with my patch. Turns out Drupal's auto escaping is smarter than Html::escape(). You are using
$this->assertSession()->responseContains(Html::escape($string));
where you expect^6%{[*>.<"field
. You are escaping the quote character. Auto-escaping is smart enough to know that the field name is not used in a HTML attribute context, so it does not need to escape the quote character and prints^6%{[*>.<"field
to the page.I think IntegrationTest::assertHtmlEscaped() should be removed and not used. What you can do instead is work with
$this->assertSession()->pageTextContains('^6%{[*>.<"field')
. That uses the HTML entity decoded page text as the user sees it in their browser. You are also testing auto escaping with that. If Drupal would not auto-escape correctly, then that special field name would be interpreted as HTML tag and ->pageTextContains() would not be able to find it.Comment #11
drunken monkeyThanks for the apology! Sorry if I’m a bit sensitive there.
I didn’t say it was necessarily an XSS vulnerability, but I still think that allowing HTML in, e.g., field or plugin labels in some instances doesn’t make sense. Those don’t allow HTML in all other contexts, so why there?
For this reason, I always test for unescaped HTML by just appending
. '&'
to things. I should have mentioned that before.Filtering against XSS is nice, but it doesn’t magically turn plain text into HTML. The data is plain text, so it has to be escaped – I don’t see a way around that.
It seems the XSS filter will actually auto-escape
&
characters, but leave&
alone, which I guess is usually the desired behavior. But it also makes the conversion completely inconsistent. And, see above, it’s arbitrary that it would occur there at all.IntegrationTest::assertHtmlEscaped()
is used to test for several things at once: That the value is present as it should be (escaped), and that it is neither present unescaped, nor double-escaped. (So, with better test data, we might have already spotted the problems you’re reporting. Should probably make sure that is now the case once we fix this.)The point about not escaping quotes is a good one. If we do end up with a solution that uses correct auto-escape for this, we should probably amend that function to check for that kind of escape, too. (Maybe optionally, with an additional parameter, or just always as a fallback.)
Comment #12
drunken monkey(Incidentally: Wasn’t it you yourself that created a d.o issue a few years ago complaining that
<b>
tags in issue titles weren’t properly escaped? Why is that suddenly not a problem for you anymore as long as there is no XSS vulnerability?)Comment #13
klausiField and plugin labels: those are admin provided, so they come form a highly trusted source. If the admins want to use some HTML tags in them I think we should not care. Most admins will not do that as it is confusing.
Issue titles on drupal.org: yes, but that is a IMO a very different context. In this case the source is an authenticated user on drupal.org, so any attacker can get that access easily. They can then embarrass the site (drupal.org) by using HTML tags.
Contrary to that a Search API site admin has no interest in embarrassing the site because they are the site owner. As long XSS is covered we can ignore what HTML tags are allowed.
Benefits if you ignore handling some HTML tags:
* Simpler code in Search API
* no double escaping bugs
* simplified test coverage
That is why I would recommend no additional escaping in admin UI modules such as Search API - to keep things simple.
I will test your patch now and see if the double escaping of the error message is fixed, I think the patch is a good step forward anyway here :-)
Comment #14
klausiI think my particular double escaping Solr bug was fixed by #3001220: Do not translate exception messages, so we are good to go here.
Comment #15
drunken monkeyThanks for your feedback!
You’re right, as this only affects the admin UI, it’s really not such a big problem as with issue titles.
However, as this covers just some of the places where those properties are displayed, I still don’t think allowing HTML in them makes sense. If someone would choose to actually use this functionality, putting HTML in there somewhere, it would look broken in other places on the site. The data is just, conceptually, plain text and we shouldn’t treat it as HTML. (But I know I’m a stickler for theoretical correctness there, where in practice it’s really too unimportant to make any difference.)
Looking over this again to add tests, I discovered a few other peculiarities:
@Translation
annotation). Those were the cases that were incorrectly escaped earlier. It’s a bit weird (I’d prefer not allowing any HTML in the labels), but conceptually it makes sense, as@Translation
“objects” are of course, by definition, markup.radios
#options
are only XSS-filtered,select
#options
are actually auto-escaped by Twig.MarkupInterface
objects – i.e., if you have a data type plugin with an HTML tag in its label, this will be included unescaped in the<value>
tag contents (where it is, of course, illegal, making the page’s HTML invalid). Not sure if we should fix that? We could introduce an additional second$strip_tags
parameter toUtility::escapeHtml()
for that.Utility::escapeHtml()
for consistency.Comment #17
borisson_I think these changes look very good now, I don't think we should change anything else.
Comment #19
drunken monkeyAlright. Thanks a lot for weighing in here, Joris!
So, committing this as-is. We can still discuss further changes if you’re convinced there are any problems. (Either here or in a new issue.)
In any case, thanks again, everyone!