Problem/Motivation
Currently, when passed an @
placeholder, SafeMarkup::format() (and therefore t()
) calls SafeMarkup::escape()
:
case '@':
// Escaped only.
$args [$key] = static::escape($value);
break;
escape() in turn calls checkPlain()
in the case where the string is not already marked safe:
return static::isSafe($string) ? $string : static::checkPlain($string);
And checkPlain() then adds the sanitized string to the safe list:
$string = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
static::$safeStrings [$string]['html'] = TRUE;
There are a couple of problems with this:
- It pollutes the safe list with many, many placeholders that will never be printed independently of the main string, and therefore do not need to be marked safe. See #2505701: Document SafeMarkup::set and Use htmlspecialchars() directly in Attribute() so we don't bloat the list of safe strings for why this matters -- that issue may result in significant memory savings on pages with a lot of Attribute use (which includes use of the LinkGenerator).
- It actually violates the intent of the calling code. In the calling code, what is considered as safe and intended for output is the whole string, not bits of it.
Proposed resolution
In SafeMarkup::format()
, use htmlspecialchars()
directly instead of checkPlain()
. My initial idea is to add a protected variant of escape()
that uses htmlspecialchars()
. The advantage of this would be that it would still respect the placeholder if it were already in the safe list as it does following #2435493: Change String::format()'s '@' and '%' placeholders to be auto-escaped rather than always-escaped, but it would not pollute the list with lots of meaningless escaped placeholders we don't care about.
The disadvantage would be that a placeholder that is used in two separate SafeMarkup::format()
or t()
calls would now get htmlspecialchars()
ed on each separate SafeMarkup::format()
if it were something not otherwise added to the safe list. However, I'm willing to bet that case is far rarer than using a placeholder's content only once in a request. (And, if we ever really need to optimize something to minimize htmlspecialchars()
calls, we can always add an explicit and documented SafeMarkup::set()
. But we're talking about a native PHP function.)
Beta phase evaluation
Issue category | Task because HEAD does work. |
---|---|
Issue priority | Major because this could potentially result in some significant memory savings. |
Prioritized changes | The main goal of this issue is reducing memory use with a side of minor security hardening. |
Disruption | May add double-escaping bugs with % and ! used for different placeholders in the same string, because of a behavior introduced by #2399037: String::format() marks a resulting string as safe even when passed an unsafe passthrough argument. These bugs already exist any time a ! is used with any markup whatsoever and so a separate fix is already needed, but this issue may add the double-escaping to additional strings that did not previously have them. |
Remaining tasks
TBD
User interface changes
TBD
API changes
- Possible API additions
- Change in the behavior of
t()
andSafeMarkup::format()
to no longer add placeholders to the safe list.
Comment | File | Size | Author |
---|---|---|---|
#24 | increment.txt | 649 bytes | pwolanin |
#24 | 2506035-24.patch | 4.47 KB | pwolanin |
#21 | increment.txt | 438 bytes | pwolanin |
#21 | 2506035-placeholders-21.patch | 4.47 KB | pwolanin |
#15 | interdiff-placholder.txt | 1.17 KB | xjm |
Comments
Comment #1
xjm% placeholders have a similar concern, actually. They too get added to the safe list via
placeholder()
->escape()
->checkPlain()
.Comment #2
xjmComment #3
xjmComment #4
xjmComment #5
catchYes really like this.
Comment #6
alexpottI'm kind of investigating something similar with #2506133: Replace SafeMarkup::set() in \Drupal\Core\Template\Attribute wrt to Attribute and not needed to mark things as safe.
Comment #7
alexpottComment #8
pwolanin CreditAttribution: pwolanin at Acquia commentedJust had basically the same idea this morning (before hearing about this issue - great minds think alike) - we don't need to add @ or % output to the safe list if it wasn't there already. Could be a very easy win.
Comment #9
YesCT CreditAttribution: YesCT commentedthis could have memory performance benefits.
Comment #10
catchAgreed, makes lots of sense.
Comment #11
alexpottI played with the attached patch this morning on pages with lots of strings and safe marking - eg simpletest, user permissions and module install I could see a huge difference being made by the patch. But I definitely agree that it makes sense. The double safemarkup setting in SafeMarkup::placeholder is especially silly.
Comment #12
joelpittetThis looks like a great idea! +1
Also made me realize that Twig uses ENT_SUBSTITUTE and we should probably do that to and there is an issue proposing that already here #1211866: Enable ENT_SUBSTITUTE flag in Html::escape so I'll cross reference.
Comment #14
xjmThese need more complete method docs obviously, but looks good to me in general.
Looks like some titles are getting double-escaped in the translation UI:
kylZaFlY [<em class="placeholder">French</em> translation]
This is because of a use of ! in
ContentTranslationHandler
and the (IMO extremely counterintuitive) change introduced in #2399037: String::format() marks a resulting string as safe even when passed an unsafe passthrough argument where having a!
placeholder basically guarantees your entire string will be double-escaped. We can probably fix this by changing ContentTranslationHandler to use an @.Comment #15
xjmFixed in attached. I didn't write docs yet though.
I keep meaning to file a followup for #2399037: String::format() marks a resulting string as safe even when passed an unsafe passthrough argument; I think it would be better to either revert that or remove the ! placeholder entirely rather than having it result in basically the exact opposite behavior of what people expect.
Comment #16
xjmAdded the bit about
!
to the disruption section and tagging for that followup.Comment #17
xjmMinor, but I think this should be
doPlaceholder()
rather thandoPlaceHolder()
given the public method name.Comment #18
xjmFiled #2506427: [meta] !placeholder causes strings to be escaped and makes the sanitization API harder to understand.
Comment #19
xjmComment #20
pwolanin CreditAttribution: pwolanin at Acquia commentedI don't see much value to the overhead of delegating to an internal ::doCheckPlain() method as opposed to just inlining the htmlspecialchars() call.
Let me re-roll it that way plus add the method name change you suggest.
Comment #21
pwolanin CreditAttribution: pwolanin as a volunteer and at Acquia commentedComment #22
pwolanin CreditAttribution: pwolanin as a volunteer and at Acquia commentedComment #23
xjmThe updated docs look great. I'm okay with this approach as well. Leaving at NR for more feedback though, especially given the added disruption for
!
.Minor: trailing comma instead of period.
Comment #24
pwolanin CreditAttribution: pwolanin as a volunteer and at Acquia commentedtypo fix
Comment #25
pwolanin CreditAttribution: pwolanin as a volunteer and at Acquia commented@xjm - are you thinking of a different issue? This patch doesn't change the
!
placeholder behavior.Comment #26
xjm@pwolanin, see the disruption section of the beta eval in the issue summary and #14.
Comment #29
pwolanin CreditAttribution: pwolanin as a volunteer and at Acquia commented@xjm - I'm not sure I understand where additional double-escaping could be added on top of what might already be introduced by the
!
change.Comment #30
alexpottThis will be fixed by the larger fix #2549393: Remove SafeMarkup::escape() and rely more on Twig autoescaping - @pwolanin can you comment on that issue so you can credit (@xjm has already)