Problem/Motivation

Currently, when passed an @ placeholder, SafeMarkup::format() (and therefore t()) calls SafeMarkup::escape():

      case '@':
        // Escaped only.
        $args [$key] = static::escape($value);
        break;

escape() in turn calls checkPlain() in the case where the string is not already marked safe:

  return static::isSafe($string) ? $string : static::checkPlain($string);

And checkPlain() then adds the sanitized string to the safe list:

  $string = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
  static::$safeStrings [$string]['html'] = TRUE;

There are a couple of problems with this:

  1. It pollutes the safe list with many, many placeholders that will never be printed independently of the main string, and therefore do not need to be marked safe. See #2505701: Document SafeMarkup::set and Use htmlspecialchars() directly in Attribute() so we don't bloat the list of safe strings for why this matters -- that issue may result in significant memory savings on pages with a lot of Attribute use (which includes use of the LinkGenerator).
  2. It actually violates the intent of the calling code. In the calling code, what is considered as safe and intended for output is the whole string, not bits of it.

Proposed resolution

In SafeMarkup::format(), use htmlspecialchars() directly instead of checkPlain(). My initial idea is to add a protected variant of escape() that uses htmlspecialchars(). The advantage of this would be that it would still respect the placeholder if it were already in the safe list as it does following #2435493: Change String::format()'s '@' and '%' placeholders to be auto-escaped rather than always-escaped, but it would not pollute the list with lots of meaningless escaped placeholders we don't care about.

The disadvantage would be that a placeholder that is used in two separate SafeMarkup::format() or t() calls would now get htmlspecialchars()ed on each separate SafeMarkup::format() if it were something not otherwise added to the safe list. However, I'm willing to bet that case is far rarer than using a placeholder's content only once in a request. (And, if we ever really need to optimize something to minimize htmlspecialchars() calls, we can always add an explicit and documented SafeMarkup::set(). But we're talking about a native PHP function.)

Beta phase evaluation

Reference: https://www.drupal.org/core/beta-changes
Issue category Task because HEAD does work.
Issue priority Major because this could potentially result in some significant memory savings.
Prioritized changes The main goal of this issue is reducing memory use with a side of minor security hardening.
Disruption May add double-escaping bugs with % and ! used for different placeholders in the same string, because of a behavior introduced by #2399037: String::format() marks a resulting string as safe even when passed an unsafe passthrough argument. These bugs already exist any time a ! is used with any markup whatsoever and so a separate fix is already needed, but this issue may add the double-escaping to additional strings that did not previously have them.

Remaining tasks

TBD

User interface changes

TBD

API changes

  • Possible API additions
  • Change in the behavior of t() and SafeMarkup::format() to no longer add placeholders to the safe list.
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

xjm’s picture

Issue summary: View changes

% placeholders have a similar concern, actually. They too get added to the safe list via placeholder() -> escape() -> checkPlain().

xjm’s picture

Issue summary: View changes
xjm’s picture

xjm’s picture

catch’s picture

Yes really like this.

alexpott’s picture

I'm kind of investigating something similar with #2506133: Replace SafeMarkup::set() in \Drupal\Core\Template\Attribute wrt to Attribute and not needed to mark things as safe.

alexpott’s picture

pwolanin’s picture

Just had basically the same idea this morning (before hearing about this issue - great minds think alike) - we don't need to add @ or % output to the safe list if it wasn't there already. Could be a very easy win.

YesCT’s picture

Issue tags: +Performance

this could have memory performance benefits.

catch’s picture

Agreed, makes lots of sense.

alexpott’s picture

Status: Active » Needs review
FileSize
2.07 KB

I played with the attached patch this morning on pages with lots of strings and safe marking - eg simpletest, user permissions and module install I could see a huge difference being made by the patch. But I definitely agree that it makes sense. The double safemarkup setting in SafeMarkup::placeholder is especially silly.

joelpittet’s picture

This looks like a great idea! +1

Also made me realize that Twig uses ENT_SUBSTITUTE and we should probably do that to and there is an issue proposing that already here #1211866: Enable ENT_SUBSTITUTE flag in Html::escape so I'll cross reference.

Status: Needs review » Needs work

The last submitted patch, 11: 2506035.11.patch, failed testing.

xjm’s picture

+++ b/core/lib/Drupal/Component/Utility/SafeMarkup.php
@@ -183,11 +183,15 @@ public static function getAll() {
+  protected static function doCheckPlain($text) {

@@ -277,12 +281,21 @@ public static function format($string, array $args = array()) {
+   * @param $text
+   * @return string
+   */
+  protected static function doPlaceHolder($text) {

These need more complete method docs obviously, but looks good to me in general.

Looks like some titles are getting double-escaped in the translation UI:
kylZaFlY [<em class="placeholder">French</em> translation]

This is because of a use of ! in ContentTranslationHandler and the (IMO extremely counterintuitive) change introduced in #2399037: String::format() marks a resulting string as safe even when passed an unsafe passthrough argument where having a ! placeholder basically guarantees your entire string will be double-escaped. We can probably fix this by changing ContentTranslationHandler to use an @.

xjm’s picture

Status: Needs work » Needs review
FileSize
3.24 KB
1.17 KB

Fixed in attached. I didn't write docs yet though.

I keep meaning to file a followup for #2399037: String::format() marks a resulting string as safe even when passed an unsafe passthrough argument; I think it would be better to either revert that or remove the ! placeholder entirely rather than having it result in basically the exact opposite behavior of what people expect.

xjm’s picture

Issue summary: View changes
Issue tags: +Needs followup

Added the bit about ! to the disruption section and tagging for that followup.

xjm’s picture

+++ b/core/lib/Drupal/Component/Utility/SafeMarkup.php
@@ -277,12 +281,21 @@ public static function format($string, array $args = array()) {
+  protected static function doPlaceHolder($text) {

Minor, but I think this should be doPlaceholder() rather than doPlaceHolder() given the public method name.

xjm’s picture

xjm’s picture

Issue tags: -Needs followup
pwolanin’s picture

I don't see much value to the overhead of delegating to an internal ::doCheckPlain() method as opposed to just inlining the htmlspecialchars() call.

Let me re-roll it that way plus add the method name change you suggest.

pwolanin’s picture

FileSize
4.47 KB
438 bytes
xjm’s picture

The updated docs look great. I'm okay with this approach as well. Leaving at NR for more feedback though, especially given the added disruption for !.

+++ b/core/lib/Drupal/Component/Utility/SafeMarkup.php
@@ -268,21 +270,38 @@ public static function format($string, array $args = array()) {
+   *   The formatted text (HTML) with wrapping EM tags,

Minor: trailing comma instead of period.

pwolanin’s picture

FileSize
4.47 KB
649 bytes

typo fix

pwolanin’s picture

@xjm - are you thinking of a different issue? This patch doesn't change the ! placeholder behavior.

xjm’s picture

@pwolanin, see the disruption section of the beta eval in the issue summary and #14.

pwolanin queued 24: 2506035-24.patch for re-testing.

pwolanin queued 24: 2506035-24.patch for re-testing.

pwolanin’s picture

@xjm - I'm not sure I understand where additional double-escaping could be added on top of what might already be introduced by the ! change.

alexpott’s picture

Status: Needs review » Closed (duplicate)

This will be fixed by the larger fix #2549393: Remove SafeMarkup::escape() and rely more on Twig autoescaping - @pwolanin can you comment on that issue so you can credit (@xjm has already)