Problem/Motivation

Drupal aims to eliminate a class of security bugs by distinguishing 3 classes of text:

  • Text that is already valid safe markup is represented by an object implementing MarkupInterface.
  • A string containing plain text that is automatically escaped when passed to the template system.
  • A string containing unsafe markup that should not be passed to the template system as it would be escaped. Instead can use Xss::filter() or Xss::filterAdmin().

It's currently hard to understand how to use Token::replace() correctly in these 3 cases:

  1. The comment on the return string instructs the caller to rely on Twig autoescaping. However the return string is markup so this is wrong and will lead to double escaping.
  2. The return string is unsafe, even if the input was safe - this is potentially surprising and should be made clearer.
  3. The correct code for escaping tokens in plain text is awkward, requiring 3 nested function calls and leading to unnecessary conversions. This is easily forgotten, and is even wrong in core, see #3264453: Incorrect usage of Token::replace() can corrupt plain text that resembles HTML

This issue has been classed as a bug because the incorrect documentation and difficult usage led to bugged code in core and contrib.

Proposed resolution

  1. Fix comments to Token::replace().
  2. Create a new function replacePlain().

Remaining tasks

Deferred to a separate issue: #3264453: Incorrect usage of Token::replace() can corrupt plain text that resembles HTML

User interface changes

None

API changes

See "proposed resolution" and change record.

Data model changes

None

Comment from original issue

@Berdir #2567257-210: hook_tokens() $sanitize option incompatible with Html sanitisation requirements.

Posting this here for now, will probably open a follow-up issue.

I've been working on updating token.module and adjusting functionality/tests for this. Which is probably something we should have done before committing this, to make sure that this works for more than the few use cases that core has.

I think there is at least one example there why supporting some sort of sanitize => FALSE is useful and important.

The basic use case is when you have user-provided, unsafe input and want it to be continue unsafe and un-escaped, because you then rely on autoescape.

One example in token.module is the block label, it has this code:

function token_block_view_alter(&$build, BlockPluginInterface $block) {
  $config = $block->getConfiguration();
  $label = $config['label'];
  if ($label != '<none>') {
    // The label is automatically escaped, avoid escaping it twice.
    $build['#configuration']['label'] = \Drupal::token()->replace($label, array(), array('sanitize' => FALSE));
  }
}

The problem is that now the block label tokens are escaped twice. There's a test that is creating a node with a ' in it, and right now, that is getting escaped twice (which is exactly what this code is testing), since we force-escape all token return values and then escape the whole string again.

I don't see a proper way to fix this right now. What technically works is using PlainTextOutput::renderFromHtml() but clearly it is not correct to use that in non-plaintext output.

We don't have to pass it to hook implementations, but I really think we need a flag to prevent auto-escaping. We even document:
The caller is responsible for choosing the right escaping / sanitization but don't actually allow to caller to do that, at least not for token values. But if the token input is untrusted and will be escaped later, we must treat token replacements as untrusted too or we are guaranteed to have double-escaping problems?

(See also the following comments about some discussion and why various things don't work IMHO)

CommentFileSizeAuthor
#147 token-plain-text-2580723-interdiff-141-147.txt1.68 KBAdamPS
#147 token-plain-text-2580723-147.patch6.47 KBAdamPS
#141 token-plain-text-2580723-141.patch6.47 KBAdamPS
#137 token-plain-text-2580723-interdiff-136-137.txt422 bytesAdamPS
#137 token-plain-text-2580723-137.patch23.67 KBAdamPS
#136 token-plain-text-2580723-interdiff-133-136.txt6.17 KBAdamPS
#136 token-plain-text-2580723-136.patch23.91 KBAdamPS
#133 token-plain-text-2580723-interdiff-122-133.txt2.45 KBAdamPS
#133 token-plain-text-2580723-133.patch21.27 KBAdamPS
#122 token-plain-text-2580723-interdiff-120-122.txt1.19 KBAdamPS
#122 token-plain-text-2580723-122.patch21.21 KBAdamPS
#120 token-plain-text-2580723-interdiff-118-120.txt1.01 KBAdamPS
#120 token-plain-text-2580723-120.patch20.43 KBAdamPS
#118 token-plain-text-2580723-interdiff-116-118.txt1.53 KBAdamPS
#118 token-plain-text-2580723-118.patch19.78 KBAdamPS
#116 token-plain-text-2580723-interdiff-115-116.txt1.61 KBAdamPS
#116 token-plain-text-2580723-116.patch19.8 KBAdamPS
#115 token-plain-text-2580723-interdiff-114-115.txt476 bytesAdamPS
#115 token-plain-text-2580723-115.patch19.81 KBAdamPS
#114 token-plain-text-2580723-interdiff-105-114.txt5.47 KBAdamPS
#114 token-plain-text-2580723-114.patch20.05 KBAdamPS
#105 token-plain-text-2580723-interdiff-101-105.txt957 bytesAdamPS
#105 token-plain-text-2580723-105.patch20.27 KBAdamPS
#98 token-plain-text-2580723-interdiff-96-98.txt956 bytesAdamPS
#98 token-plain-text-2580723-98.patch20.27 KBAdamPS
#96 token-plain-text-2580723-interdiff-87-96.txt3.02 KBAdamPS
#96 token-plain-text-2580723-96.patch20.27 KBAdamPS
#87 token-plain-text-2580723-interdiff-86-87.txt523 bytesAdamPS
#87 token-plain-text-2580723-87.patch17.45 KBAdamPS
#86 token-plain-text-2580723-86.patch17.49 KBAdamPS
#85 token-plain-text-2580723-interdiff-78-85.txt3.75 KBAdamPS
#85 token-plain-text-2580723-85.patch18.07 KBAdamPS
#78 token-plain-text-2580723-interdiff-76-78.txt1.61 KBAdamPS
#78 token-plain-text-2580723-78.patch17.85 KBAdamPS
#76 token-plain-text-2580723-interdiff-70-76.txt3.08 KBAdamPS
#76 token-plain-text-2580723-76.patch18.04 KBAdamPS
#70 token-plain-text-2580723-interdiff-68-70.txt413 bytesAdamPS
#70 token-plain-text-2580723-70.patch17.15 KBAdamPS
#69 token-plain-text-2580723-interdiff-67-68.txt1.22 KBAdamPS
#69 token-plain-text-2580723-68.patch17.74 KBAdamPS
#67 token-plain-text-2580723-interdiff-65-67.txt3.08 KBAdamPS
#67 token-plain-text-2580723-67.patch17.44 KBAdamPS
#65 token-plain-text-2580723-interdiff-64-65.txt372 bytesAdamPS
#65 token-plain-text-2580723-65.patch18.4 KBAdamPS
#64 token-plain-text-2580723-interdiff-61-64.txt15.24 KBAdamPS
#64 token-plain-text-2580723-64.patch18.41 KBAdamPS
#61 token-plain-text-2580723-interdiff-58-61.txt2.28 KBAdamPS
#61 token-plain-text-2580723-61.patch24.04 KBAdamPS
#58 token-plain-text-2580723-interdiff-56-58.txt1.66 KBAdamPS
#58 token-plain-text-2580723-58.patch24.03 KBAdamPS
#56 token-plain-text-2580723-interdiff-55-56.txt2.43 KBAdamPS
#56 token-plain-text-2580723-56.patch23.11 KBAdamPS
#55 token-plain-text-2580723-interdiff-54-55.txt13.54 KBAdamPS
#55 token-plain-text-2580723-55.patch21.79 KBAdamPS
#54 token-plain-text-2580723-interdiff-53-54.txt808 bytesAdamPS
#54 token-plain-text-2580723-54.patch11.38 KBAdamPS
#53 token-plain-text-2580723-interdiff-51-53.txt1.14 KBAdamPS
#53 token-plain-text-2580723-53.patch11.38 KBAdamPS
#52 token-plain-text-2580723-interdiff-41-51.txt7.41 KBAdamPS
#52 token-plain-text-2580723-51.patch11.33 KBAdamPS
#41 2580723-41.patch6.6 KBandypost
#41 interdiff.txt912 bytesandypost
#40 token-plain-text-2580723-interdiff-36-39.txt3.96 KBAdamPS
#40 token-plain-text-2580723-interdiff-39-40.txt1.64 KBAdamPS
#40 token-plain-text-2580723-40.patch6.5 KBAdamPS
#39 token-plain-text-2580723-39.patch6.44 KBAdamPS
#36 token-plain-text-2580723-36.patch2.67 KBAdamPS
#25 token-plain-text-2580723-interdiff-23-25.txt6.14 KBAdamPS
#25 token-plain-text-2580723-25.patch2.94 KBAdamPS
#23 token-plain-text-2580723-diff-9-23.txt3.99 KBAdamPS
#9 token-plain-text-2580723-9.patch4.5 KBBerdir
#23 token-plain-text-2580723-23.patch5.92 KBAdamPS
#2 token-escape-option-2580723-2.patch1.95 KBBerdir
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Berdir created an issue. See original summary.

Berdir’s picture

Status: Active » Needs review
FileSize
1.95 KB

Example patch, this fixes token_block_view_alter().

Berdir’s picture

Another name for that option could be plain_text (or plain-text or plaintext or plain) => TRUE/FALSE wher the default would be FALSE.

dawehner’s picture

Did you considered to use the don't encode me twice option on htmlspecialchars()

+++ b/core/lib/Drupal/Core/Utility/Token.php
@@ -194,6 +197,11 @@ public function replace($text, array $data = array(), array $options = array(),
+    $escape = !isset($options['escape']) || $options['escape'] === TRUE ? TRUE: FALSE;

I really try to figure out why we can't just use !empty($options['escape']

catch’s picture

What happens to Safestrings?

Berdir’s picture

#4: Because it was late ;)

#5. Hm, yes, right now, they'd get escaped again. And they wouldn't with escape + renderFromHtml() But I'm not sure what you'd expect to happen if you use a token that prints out markup, like a formatted node body for a block label? Should that really work? It certainly works the same way right now in D7, if you have sanitize FALSE and HTML in your body, then it will be escaped in the block label, so it's certainly not a regression in any way.

Maybe that's something that a better name for the option could clarify, as I said in #3, something with plaintext or userinput in it. Then that might make it clearer that it's not just about esaping or not but that it's really about treating the text and the tokens as unsafe user input that must and will be escaped later if printed in a HTML context.

Berdir’s picture

Issue tags: +Needs tests

#5: Actually they would as well with renderFromHtml() because the result of that isn't safe, so it'd be escaped as well. Really no difference and no way to make safe strings work in plain text token replacements as far as I see.

catch’s picture

Status: Needs review » Needs work

5. Hm, yes, right now, they'd get escaped again. And they wouldn't with escape + renderFromHtml()

Right, so the problem here is that we'd be mixing strings that are ready-for-HTML (safe string) and strings which are ready-for-escaping (plain text) into a single string replaced as part of the $text argument, with no knowledge of which is which

It's a problem in D7 too, and was a problem that was brought up in the original token issue before it was committed (but was eventually ignored), and we've just spent about a month fixing it. renderFromHtml() will strip tags as well as entity decoding, so you don't get 'double escaped HTML entities' when you use it. It might not be perfect for all use cases, but it at least does what it says it will.

and the tokens as unsafe user input that must and will be escaped later if printed in a HTML context.

The tokens are either plain text (to be escaped) or HTML, they are not uniformly safe or unsafe.

Berdir’s picture

Status: Needs work » Needs review
FileSize
4.5 KB

Discussed this a bit with @catch. Maybe we want a separate method to do plain text only replacements and convert all token replacements to plain text?

Status: Needs review » Needs work

The last submitted patch, 9: token-plain-text-2580723-9.patch, failed testing.

Version: 8.0.x-dev » 8.1.x-dev

Drupal 8.0.6 was released on April 6 and is the final bugfix release for the Drupal 8.0.x series. Drupal 8.0.x will not receive any further development aside from security fixes. Drupal 8.1.0-rc1 is now available and sites should prepare to update to 8.1.0.

Bug reports should be targeted against the 8.1.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.2.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Berdir’s picture

I'm running into this problem over and over again.

The only solution that worked as expected for page_manager is this:

    // Token replace only escapes replacement values, ensure a consistent
    // behavior by also escaping the input and then returning it as a Markup
    // object to avoid double escaping.
    // @todo: Simplify this when core provides an API for this in
    //   https://www.drupal.org/node/2580723.
    $title = $this->token->replace(new HtmlEscapedText($page_title), $data);
    return Markup::create($title);

See #2668980: Certain characters in token page titles are double-encoded

Instead of not escaping anymore, we treat the input as unsafe, then escape both the text and the token replacements consistently and then treat the resulting string as safe markup, to prevent double escaping. That was the *only* way to avoid double-escaping consistently when having for example a & in the node title with [node:title] or just putting a & directly in the page variant title.

I suppose the core issue could check if the input is safe markup, then use that as-is and otherwise escape.

Especially if the input is safe markup, then I suppose you could still end up with e.g. tokens in attributes being problematic and unsafe, but honestly, I have no idea how you could make that safe. And it's certainly better than the API's we offer now, that result in double-escaping and other problems for everyone that tries to use tokens and put the output on an HTML page.

Version: 8.1.x-dev » 8.2.x-dev

Drupal 8.1.9 was released on September 7 and is the final bugfix release for the Drupal 8.1.x series. Drupal 8.1.x will not receive any further development aside from security fixes. Drupal 8.2.0-rc1 is now available and sites should prepare to upgrade to 8.2.0.

Bug reports should be targeted against the 8.2.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.3.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

andypost’s picture

Yep, this issue becomes more annoying because page templates escapes again, so sometimes you need to hack like:

- {% set title = head_title|safe_join(' | ') %}
+ {% set title = head_title['title'] %}

Version: 8.2.x-dev » 8.3.x-dev

Drupal 8.2.6 was released on February 1, 2017 and is the final full bugfix release for the Drupal 8.2.x series. Drupal 8.2.x will not receive any further development aside from critical and security fixes. Sites should prepare to update to 8.3.0 on April 5, 2017. (Drupal 8.3.0-alpha1 is available for testing.)

Bug reports should be targeted against the 8.3.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.4.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.3.x-dev » 8.4.x-dev

Drupal 8.3.6 was released on August 2, 2017 and is the final full bugfix release for the Drupal 8.3.x series. Drupal 8.3.x will not receive any further development aside from critical and security fixes. Sites should prepare to update to 8.4.0 on October 4, 2017. (Drupal 8.4.0-alpha1 is available for testing.)

Bug reports should be targeted against the 8.4.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.5.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.4.x-dev » 8.5.x-dev

Drupal 8.4.4 was released on January 3, 2018 and is the final full bugfix release for the Drupal 8.4.x series. Drupal 8.4.x will not receive any further development aside from critical and security fixes. Sites should prepare to update to 8.5.0 on March 7, 2018. (Drupal 8.5.0-alpha1 is available for testing.)

Bug reports should be targeted against the 8.5.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.6.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.5.x-dev » 8.6.x-dev

Drupal 8.5.6 was released on August 1, 2018 and is the final bugfix release for the Drupal 8.5.x series. Drupal 8.5.x will not receive any further development aside from security fixes. Sites should prepare to update to 8.6.0 on September 5, 2018. (Drupal 8.6.0-rc1 is available for testing.)

Bug reports should be targeted against the 8.6.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.7.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

AdamPS’s picture

I agree the assumption to always run Html::escape is definitely too restrictive. For module RRSSB the output of the tokens is used in a URL, so the correct filter function is rawurlencode.

A workaround is possible like this:

$token_service->replace($text, $data, ['callback' => '_token_decode']);

function _token_decode(&$replacements, $data, $options) {
  $replacements = array_map(function($s) { return htmlspecialchars_decode($s, ENT_QUOTES); }, $replacements);
}

I'm not sure the patch form #9 is the way to go - it seems to duplicate the whole function hard-coding a different filter function PlainTextOutput::renderFromHtml. Developers will need different filter functions in different cases and surely we don't want to copy the whole function again and again. It seems better to take the patch from #2 subject to comments in #4 which is also more consistent with D7.

Right, so the problem here is that we'd be mixing strings that are ready-for-HTML (safe string) and strings which are ready-for-escaping (plain text) into a single string replaced as part of the $text argument, with no knowledge of which is which

In that case you pass a callback function that does the escaping of each token.

However the original case raised by @Berdir suggests that in other cases, the whole return string is ready-for-escaping user-text and will be escaped later, perhaps by TWIG.

AdamPS’s picture

Title: Add an argument to Token::replace() to disable escaping » Token::replace() mixes escaped/unescaped and forces potentially unwanted encoding
Category: Task » Bug report
Issue summary: View changes
Priority: Normal » Major

OK I've had a go at updating the IS based on my experience and the other comments. I feel this can reasonably be viewed as a major bug, because it affects the whole token system and, except where developer have written hacks like #14, there will be double-escaping.

Core experts please tell me if I've got it all wrong:-)

AdamPS’s picture

Issue summary: View changes
AdamPS’s picture

Issue summary: View changes

OK I think this needs input from a core committer / expert before it can go further.

1) I belatedly notice that the comment for $text does clearly state the the caller must pass HTML text, so that's not a bug. It does seem a bit inconsistent and unhelpful though - why not escape $text automatically too?

2) The return string has a comment like this

The token result is the entered HTML text with tokens replaced. The caller is responsible for choosing the right escaping / sanitization.

That doesn't seem true. The text has already been escaped and the caller mustn't escape it again. Hence the return value should be converted to MarkupInterface.

3) And the comment continues like this

If the result is intended to be used as plain text, using PlainTextOutput::renderFromHtml() is recommended.

That procedure will fail to generate the correct string in many cases. A much closer reversal of Html::escape is to call htmlspecialchars_decode, but even that is not perfect. We need a way to use tokens directly on plain text - either a parameter on the existing method or a separate one.

AdamPS’s picture

Title: Token::replace() mixes escaped/unescaped and forces potentially unwanted encoding » Token::replace() leads to double-escaping and unwanted escaping
Issue summary: View changes
Status: Needs work » Needs review
FileSize
5.92 KB
3.99 KB

I've had a go at a patch - comments welcome.

Sorry my interdiff tool fails, so I've uploaded a diff. In essence there are two main changes from #9

  • replace() returns MarkupInterface
  • replacePlainText() only converts to plain text if MarkupInterface

Status: Needs review » Needs work

The last submitted patch, 23: token-plain-text-2580723-23.patch, failed testing. View results
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

AdamPS’s picture

Title: Token::replace() leads to double-escaping and unwanted escaping » Token::replace() leads to double-escaping
Issue summary: View changes
Status: Needs work » Needs review
FileSize
2.94 KB
6.14 KB

I suspect D8 has many double-escaping bugs relating to this issue. However many of them are not understood/noticed, because most of the time the text doesn't contain a quote.

I see double-escaping in user notification mails when they are formatted as HTML mails, e.g. by swiftmailer. This lead me to notice that in two places user_mail() is missing the required call to Html::escape() on the $text parameter.

I have updated the issue summary and created a new patch that implements option 1.

I removed Token::replacePlainText() from the patch for the moment because that seems secondary. Whilst it would be a convenience wrapper and better performance, it's not a bug. The current interface gives the correct result for plain text if used as the comments require:

PlainTextOutput::renderFromHtml(Token::replace(Html::escape($text)));

Status: Needs review » Needs work

The last submitted patch, 25: token-plain-text-2580723-25.patch, failed testing. View results
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

AdamPS’s picture

Issue summary: View changes

From the test failures, it seems that option 1 is not going to be possible because of BC. However the broken tests are generally highlighting broken code!

LinkFieldTest: LinkFormatter::viewElements passes the title (which is not HTML) without calling Html::escape. This line has a comment "Unsanitized token replacement here" but the comment for Token::replace() doesn't allow such a usage. This code is exactly what #2567257: hook_tokens() $sanitize option incompatible with Html sanitisation requirements intended to avoid - it mixes unescaped $text with token replacements that might contain markup.

LocaleStringIsSafeTest: fails because the input strings are encoded HTML but not MarkupInterface. That's not generally correct in D8 and will often lead to double-encoding. The strings are in test code so I guess it didn't matter.

I added option 3: wait until D9 then we can make a non-BC change.

AdamPS’s picture

Issue summary: View changes
matthiasm11’s picture

Related issues: +#3026041: Token escaping

Version: 8.6.x-dev » 8.8.x-dev

Drupal 8.6.x will not receive any further development aside from security fixes. Bug reports should be targeted against the 8.8.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.9.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

bkosborne’s picture

We ran into this problem with the CAS Attributes module, which uses the Token utility service to perform token replacement on strings that are intended as field values for the user entity, and thus should have no sanitizing at all applied to them. Currently the module works around this by calling Html::decodeEntities() on the text that was passed thru the service to reverse it. Would be great to be able to tell Token utility to just leave the text alone, which I think is a use case described in the issue summary.

Version: 8.8.x-dev » 8.9.x-dev

Drupal 8.8.7 was released on June 3, 2020 and is the final full bugfix release for the Drupal 8.8.x series. Drupal 8.8.x will not receive any further development aside from security fixes. Sites should prepare to update to Drupal 8.9.0 or Drupal 9.0.0 for ongoing support.

Bug reports should be targeted against the 8.9.x-dev branch from now on, and new development or disruptive changes should be targeted against the 9.1.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

AdamPS’s picture

Issue summary: View changes

Added an option 3

AdamPS’s picture

Issue summary: View changes

Darn, I am repeating myself. Option 3 was tried in patch #25, but it's wrong. There is a clear precedent for passing unsafe markup to Token::replace() so we need to allow for that.

AdamPS’s picture

Version: 8.9.x-dev » 9.3.x-dev
Priority: Major » Normal
AdamPS’s picture

Status: Needs work » Needs review
FileSize
2.67 KB
andypost’s picture

It looks very promising for page titles (is what I mean in #14), just needs to update tests to make sure that text is different object

+++ b/core/lib/Drupal/Core/Utility/Token.php
@@ -221,7 +221,8 @@ public function replace($text, array $data = [], array $options = [], Bubbleable
+    return ($text instanceof MarkupInterface) ? Markup::create($result) : $result;

it needs inline comment about why new object created.

Status: Needs review » Needs work

The last submitted patch, 36: token-plain-text-2580723-36.patch, failed testing. View results

AdamPS’s picture

Status: Needs work » Needs review
Issue tags: -Needs tests
FileSize
6.44 KB
AdamPS’s picture

andypost’s picture

AdamPS’s picture

Issue summary: View changes

Thanks for #41. I feel that this is ready for RTBC. Apart from that last one line change the patch came from me, so it feels better if you are the reviewer - OK with you?

@Berdir, if you are still following this one, it would be great to get clarification from you as raiser that this patch solves your original problem - thanks.

AdamPS’s picture

Issue summary: View changes
AdamPS’s picture

Status: Needs review » Reviewed & tested by the community

Unblocking this one: AdamPS and andypost have both contributed to it and reviewed each others contribution so it seems fair to set RTBC.

darvanen’s picture

I've run over the code and have no feedback, +1 RTBC.

I agree with the follow-up issues suggested in remaining tasks > proposed resolution.

darvanen’s picture

I think this should go to a framework manager who might know more about possible flow-on effects since we're changing the output of the replace() method.

How confident are we that "calling code does not ever need to be updated because of this patch" per the summary?

Berdir’s picture

Oh I forgot I created this issue originally ;)

You want to test this patch with modules like metatag and pathauto and feed them both actual HTML (e.g. in body field that you use as description metadata) and also special characters like a < and verify that the output remains unchainged. Various contrib modules are doing quite a dance right now with the token replace output to make things work. That might or might not be affected, but it is possible that it can be simplified once this issue happened and they can depend on a corresponding core version.

darvanen’s picture

Well, here's the result for metatag using the Basic HTML filter in article (node) body from standard installation:

Value:

<p>There's HTML in this body including <strong>strong</strong>, <em>emphasis</em>, &amp; a link<a href="#">#</a>.</p>

Output prior to patch: <meta name="description" content="There&#039;s HTML in this body including strong, emphasis, &amp; a link#." />

Output after patch: <meta name="description" content="There&#039;s HTML in this body including strong, emphasis, &amp; a link#." />

I'm not entirely certain what that tells us about this patch. That metatag doesn't require it?

larowlan’s picture

Status: Reviewed & tested by the community » Needs work
Issue tags: -Needs framework manager review

Adding issue credits first up

+++ b/core/modules/locale/tests/src/Kernel/LocaleStringIsSafeTest.php
@@ -54,19 +55,21 @@ public function testLocalizedTokenizedString() {
-        'replaced' => 'Go to the &lt;a href=&quot;javascript:alert(&amp;#039;Mooooh!&amp;#039;);&quot;&gt;frontpage&lt;/a&gt;',
+        'replaced' => 'Go to the <a href="javascript:alert(&#039;Mooooh!&#039;);">frontpage</a>',
...
-        'replaced' => 'Hello &lt;strong&gt;&amp;lt;script&amp;gt;alert(&amp;#039;Mooooh!&amp;#039;);&amp;lt;/script&amp;gt;&lt;/strong&gt;!',
+        'replaced' => 'Hello <strong>&lt;script&gt;alert(&#039;Mooooh!&#039;);&lt;/script&gt;</strong>!',

Whilst I realise that this change is because the test is wrong, we were passing a safe string to token and it was being double-escaped, I think there would be plenty of projects out there that are relying on this incorrect behaviour as their last line of defense.

For that reason, I think we should go for the complete patch here. I.e adding a second method for replaceMarkup and adding the new logic there. We can typehint the text param with MarkupInterface.

We can then add a deprecation warning in replace if someone passes in anything other than a string.

This then allows us to typehint the text param in D10 with string.

I realise we're getting close to the end of the road for D9 deprecations, so we'd need to move fast on that. Feel free to ping me for followups.

darvanen’s picture

#49: Agree. Thanks for putting your finger on what was making me uneasy about the current approach.

AdamPS’s picture

Issue summary: View changes
Status: Needs work » Needs review

Thank you all for good feedback. It seems we are all agreed - let's do this properly. I propose that it we are going down the route of deprecations and forcing people to change code then let's do it 100% properly.

New patch creates three new functions replaceMarkup(), replacePlain() and replaceUnsafe() and deprecates the original replace() function. See change record.

This makes the new interface extremely clear and forces all calling code to examine their use case and pick the appropriate function. Type hinting string @param is worth doing, but note that PHP automatically converts MarkupInterface to string as required (I just ran a test to confirm).

There will be lots of test fails however I would like to get feedback on the general approach before going further into it.

AdamPS’s picture

AdamPS’s picture

AdamPS’s picture

AdamPS’s picture

Quick easy-win attempt to fix many of the failing tests - but certainly not all.

AdamPS’s picture

Status: Needs review » Needs work

The last submitted patch, 56: token-plain-text-2580723-56.patch, failed testing. View results

AdamPS’s picture

AdamPS’s picture

Title: Token::replace() leads to double-escaping » Unclear documentation and incorrect usage of Token::replace()
Issue summary: View changes

To reviewers - summary of my direction and reasoning:

  1. We agreed to add replaceMarkup().
  2. Therefore it seems to make sense also to add replacePlain() as a convenience wrapper (was previously a follow on step).
  3. In which case we should complete the trio and add replaceUnsafe() to give symmetry and interface clarity.
  4. At which point, we can deprecate replace() entirely, rather than selectively detect some specific cases This should ensure that all developer calling the interface take a good look at their code and choose the correct new function.
  5. Benefit: we are making all the deprecations and interface changes in one go, rather than annoying people again later.
  6. Benefit: we are fixing all the problems rather than leaving parts to follow on issues

However it does make the patch much bigger. And in some cases it was a little tricky to be sure which of the new functions to call, i.e. what was contained in $text.

Status: Needs review » Needs work

The last submitted patch, 58: token-plain-text-2580723-58.patch, failed testing. View results

AdamPS’s picture

Status: Needs review » Needs work

The last submitted patch, 61: token-plain-text-2580723-61.patch, failed testing. View results

larowlan’s picture

I think deprecating ::replace is too disruptive.

As above, I would prefer we only emitted a deprecation error if the passed argument is an object, pointing people to the new replaceMarkup

::replace can just proxy to ::replacePlain or ::replaceMarkup (with the trigger_error)

That way for people using it correctly there's no disruption. As you point out, the casting to string means that we can't type-hint the text argument in D10, but we can throw a Type error - https://3v4l.org/qCJsV

Forcing all the correct calls to ::replace to instead use ::replacePlain feels like a lot of busy-work - what do others think?

AdamPS’s picture

Thanks @larowlan yes I think you are right. Here is a new patch which is simpler and feels good.

With this patch Token::replace() is for use on unsafe markup only. We still have 3 separate functions for the 3 cases. However by keeping the name as replace() instead of replaceUnsafe() it saves a lot of work.

Status: Needs review » Needs work

The last submitted patch, 65: token-plain-text-2580723-65.patch, failed testing. View results

AdamPS’s picture

The deprecation seems to trigger all over the place - I don't yet understand why. Let's try without it.

AdamPS’s picture

It seems that WidgetInterface->getFilteredDescription() sometimes returns MarkupInterface and sometimes string.

So I guess that calling replace() is OK provided that you treat the result as unsafe. It doesn't really matter if the input happened to be safe.

replaceMarkup() should be called when you know the input is safe and want the result to be.
replacePlain() should be called to apply conversion to markup and back.

AdamPS’s picture

Same as #67 but with the commented out code removed.

AdamPS’s picture

Status: Needs review » Needs work

The last submitted patch, 70: token-plain-text-2580723-70.patch, failed testing. View results

AdamPS’s picture

Status: Needs work » Needs review
larowlan’s picture

This is looking really good, much smaller change and far less disruption.

+++ b/core/lib/Drupal/Core/Utility/Token.php
@@ -174,15 +175,79 @@ public function __construct(ModuleHandlerInterface $module_handler, CacheBackend
+   *   The entered unsafe markup with tokens replaced. The caller is
+   *   responsible for choosing the right sanitization, for example the result
+   *   can be put into #markup, in which case it would be sanitized by
+   *   Xss::filterAdmin().

I think we should add @see links to the two new methods here.

I also think we're still missing the trigger_error when $unsafe is an object that implements MarkupInterface.

Berdir’s picture

+++ b/core/lib/Drupal/Core/Utility/Token.php
@@ -174,15 +175,79 @@ public function __construct(ModuleHandlerInterface $module_handler, CacheBackend
+  public function replaceMarkup(?MarkupInterface $markup, array $data = [], array $options = [], BubbleableMetadata $bubbleable_metadata = NULL) {
+    $result = $this->doReplace($markup, $data, $options, $bubbleable_metadata);
+    // All replacements were escaped and the input string is safe so the return
+    // string is also safe.
+    return Markup::create($result);

supporting null here seems really strange. I understand that we supported it until now and there are probably edge cases calling it like that with a string or null, but IMHO, for the new methods, we should be more strict. At the very least with the markup object, why would you have a use case where you pass in NULL there? If you need to deal with null or empty string then you do that with what you pass to Markup::create()?

AdamPS’s picture

Thanks for the comments. I'll load another patch soon when I have more time.

I also think we're still missing the trigger_error when $unsafe is an object that implements MarkupInterface.

Please see #68. I was really attracted to adding that but it seems difficult without major disruption - there are common functions in core that return MarkupInterface|string, especially FieldStorageDefinitionInterface::getDescription() (even though the documentation says the return value is string|null).

supporting null here seems really strange.

Yes I found it strange at first too. However when I thought more, it seemed similar to above. You are imagining the call to Token::replace() is preceded by a call to Markup::create() but often it won't be true as the markup comes from a function. StylePluginBase::getField() is documented to return MarkupInterface|null, and maybe other functions not explicitly documented. So either Token::replace*() handles null or lots of calling code has to check for themselves.

AdamPS’s picture

OK I'll have one more try at the ideas in #73 and #74 to see if I can get them to work.

Status: Needs review » Needs work

The last submitted patch, 76: token-plain-text-2580723-76.patch, failed testing. View results

AdamPS’s picture

#74 Done

#73 I'm afraid I don't see any safe way to add the deprecation

AdamPS’s picture

Re#73, to recap my thinking from #68:

  • replaceMarkup() should be called when you know the input is safe and want the result to be.
  • replacePlain() should be called to apply conversion to markup and back.
  • replace() should be called when you treat the result as unsafe and perhaps it doesn't really matter if the input happened to be safe sometimes (no harm in filtering again).

In particular the problems is that WidgetInterface->getFilteredDescription() sometimes returns MarkupInterface and sometimes string.

alexpott’s picture

  1. It's great to see this issue already exists - definitely agree we need to so something.
  2. +++ b/core/lib/Drupal/Core/Utility/Token.php
    @@ -174,15 +175,82 @@ public function __construct(ModuleHandlerInterface $module_handler, CacheBackend
    +  public function replacePlain(string $plain, array $data = [], array $options = [], BubbleableMetadata $bubbleable_metadata = NULL) {
    +    $result = $this->doReplace(Html::escape($plain), $data, $options, $bubbleable_metadata);
    +    return PlainTextOutput::renderFromHtml($result);
    +  }
    

    I've recently discovered this is not this simple.

    For example if you but html into site name and site slogan this will cause the html tags in the site slogan to be removed and it'll cause the html in the site name to be escaped and then unescaped - resulting in the return looking like html!

  3. +++ b/core/lib/Drupal/Core/Utility/Token.php
    @@ -234,6 +302,10 @@ public function replace($text, array $data = [], array $options = [], Bubbleable
    +    if (is_null($text)) {
    +      return [];
    +    }
    

    Why is this here?

One thought is that in HEAD in Token::replace() we do:

    // Escape the tokens, unless they are explicitly markup.
    foreach ($replacements as $token => $value) {
      $replacements[$token] = $value instanceof MarkupInterface ? $value : new HtmlEscapedText($value);
    }

This is the point where I think we need different behaviour depending on whether we want plain html free output or output intended from markup.

For the plain case we should be doing:

    // Escape the tokens, unless they are explicitly markup.
    foreach ($replacements as $token => $value) {
      $replacements[$token] = $value instanceof MarkupInterface ? PlainTextOutput::renderFromHtml($value) : new HtmlEscapedText($value);
    }

... and then not calling PlainTextOutput::renderFromHtml() on the final result. This would mean that HTML in the site title would be escaped and HTML in site slogan would be stripped. This is how it would appear in the browser so I think this is correct. It think it is possible to argue that we could do the following instead...

    // Strip all HTML tags and decode any entities into UTF-8.
    foreach ($replacements as $token => $value) {
      $replacements[$token] = PlainTextOutput::renderFromHtml($value);
    }

As this would result in more consistent plain output.

AdamPS’s picture

Thanks for your comments @alexpott

2.

if you put html into site name...

All the evidence I can see indicates that site name is a plain text config entry rather than an HTML one:

  • I entered text containing HTML entities (You and Me) for site name and the HTML entities were escaped and visible in my page header.
  • system_tokens returns a plain string for site:name in contrast to site:slogan which returns markup.

resulting in the return looking like html!

Therefore I would say this is exactly correct. If my site name is "Rock & Roll", "The HTML

ision" (a site teaching about HTML??) or "Your buys", etc then the value might resemble HTML, but it isn't - the entities should be escaped, exactly as they would be when typing into a field with text format plain_text.

I believe "the return looking like html" is how the code behaves now, and any change would be a BC-break.

3.

Why is this here?

Because the first parameter to str_replace is not nullable, so passing NULL would trigger a deprecation error from PHP8.1 - see RFC.

However I agree it's not really part of this issue so I can take it out and allow someone to fix it separately if that is preferred.

One thought is that ...

I would say that for the plain text case, we could be doing:

    // Convert tokens to plain text if they are explicitly markup.
    foreach ($replacements as $token => $value) {
      $replacements[$token] = $value instanceof MarkupInterface ? PlainTextOutput::renderFromHtml($value) : $value;
    }

I removed the call to HtmlEscapedText as it would create a bug with unwanted or double escaping - the return value of replacePlain() is plain text, so for example it would be escaped again in a TWIG template.

The alternative code is likely slightly better performance, but I claim it gives exactly the same result. The original code converts the plain text token value to markup and then converts back to plain text. The optimised code doesn't convert at all.

alexpott’s picture

@AdamPS if you put <strong>Site</strong> as your site name it will appear as escaped html on you site - ie you will see <strong>Site</strong> and in the HTML code it will be &lt;strong&gt;Site&lt;/strong&gt;. Currently if you do PlainTextOutput::renderFromHtml() on a token with that site name then the eventual output will have the HTML <strong>Site</strong> in it.

This behaviour is not only surprising - it is a bit dangerous. Consider the following:

>>> \Drupal::configFactory()->getEditable('system.site')->set('name', '<strong>My site</strong>')->set('slogan', '<strong>Slogan!!!!</strong>')->save();
=> Drupal\Core\Config\Config {#4455
     uuid: "f61d6ef6-4d63-4dec-ac91-0c9862fa9fc7",
     name: "<strong>My site</strong>",
     mail: "admin@example.com",
     slogan: "<strong>Slogan!!!!</strong>",
     page: [
       403 => "",
       404 => "",
       "front" => "/user/login",
     ],
     admin_compact_mode: false,
     weight_select_max: 100,
     langcode: "en",
     default_langcode: "en",
     _core: [
       "default_config_hash" => "AyT9s8OUcclfALRE_imByOMgtZ19eOlqdF6zI3p7yqo",
     ],
   }
>>> \Drupal::token()->replace('[site:name] [site:slogan]');
=> "&lt;strong&gt;My site&lt;/strong&gt; <strong>Slogan!!!!</strong>"
>>> \Drupal\Component\Render\PlainTextOutput::renderFromHtml(\Drupal::token()->replace('[site:name] [site:slogan]'));
=> "<strong>My site</strong> Slogan!!!!"
AdamPS’s picture

@alexpott Thanks for commenting again.

I ran your code on a site without the patch from this issue, and I got exactly the same result as you did. Is that what you expected? In other words, are you making a criticism of how the existing code works?

The site name and slogan seem to have different behaviour - the name is treated as a plain text field and the slogan as markup. I also find this surprising and there is no indication in the UI. In my tests it appeared that markup for the slogan was filtered for example to remove script tags.

Please can you explain what you would like to happen here?

  1. What change would you like to see?
  2. How could we achieve it in a BC way? It seems to me that existing sites could be relying on the current behaviour. If the site name is "Rock & Roll" then they would not need to escape the ampersand. With the same slogan, then they would need to escape it. Site admins may have noticed this behaviour and compensated for it.)
  3. How would you expect the changes to be divided into different issues? The main thrust of this issue so far is about the general token interface rather than addressing specific tokens.
alexpott’s picture

@AdamPS I think that the renderPlain() added by this patch needs to return "My site Slogan!!!!" for the example in #82. Under hood it needs to do

    // Strip all HTML tags and decode any entities into UTF-8.
    foreach ($replacements as $token => $value) {
      $replacements[$token] = PlainTextOutput::renderFromHtml($value);
    }

renderPlain does not need to care about our MarkupObject as far as I can see.

Another possible change would be do change \Drupal\Component\Render\PlainTextOutput

  public static function renderFromHtml($string) {
    return Html::decodeEntities(strip_tags((string) $string));
  }

to

  public static function renderFromHtml($string) {
    return strip_tags(Html::decodeEntities(strip_tags((string) $string)));
  }

That said the problem here is that \Drupal\Component\Render\PlainTextOutput() means render to context where HTML has no meaning. It does not mean remove all html no matter what.

AdamPS’s picture

@AdamPS I think that the renderPlain() added by this patch needs to return "My site Slogan!!!!" for the example in #82.

Great thanks for explaining. So I'm afraid that I don't agree - your argument appears very plausible but I have studied this matter very carefully from my perspective as maintainer of simplenews and swiftmailer modules. The problem with what you suggest is that it would strip perfectly legitimate plain text values that resemble HTML - in particular anything containing a "less than" followed by a letter.

Going back to your paragraph from #82, you said this:

if you put Site as your site name it will appear as escaped html on you site - ie you will see Site and in the HTML code it will be <strong>Site</strong>. Currently if you do PlainTextOutput::renderFromHtml() on a token with that site name then the eventual output will have the HTML Site in it.

In the first case, the site name has been displayed in an HTML page, so it has been correctly escaped.

In the second case, the key point is that the return value of renderPlain() is plain text - not HTML. So before it is ever displayed on an HTML page, it will be escaped, for example by TWIG auto-escaping. If we already escaped it, then it would be escaped again, which is of course wrong as the escape codes would be displayed on the page. Or the text might be put into a plain text mail. If we escaped it, the mail agent will show the escape codes.

I have uploaded a new patch that improves the code based on your useful comments in #81.

AdamPS’s picture

Re-roll

Clarification of previous comment. It might be clearer if we take a differ example, such as "Your <best> buys" (I tried to type this above but apologies I forgot to escape the ampersands).

  1. The site name will display on the page including the word best inside angle brackets
  2. It's the same using replacePlain() from the latest patch
  3. It's the same in the token output from code you would use prior to this patch (see below, taken from CR)
PlainTextOutput::renderFromHtml($token_service->replace(Html::escape($plain)));

This token replacing code is not just for site-name of course - it's used for any token such as [node:title]. These can reasonably contain a < for technical sites about HTML programming or maths/science.

AdamPS’s picture

darvanen’s picture

I agree with @AdamPS that I would expect replacePlain() to return "Your <best> buys" rather than "Your buys" if I had entered the former as my site name.

Is it still the plan to deprecate passing Markup to replace()? Or does the IS need an update?

darvanen’s picture

Just want to take a step back here and examine #2578569: Move token sanitization from Token::replace() to Token::generate().

Am I right in thinking that is attempting to solve the same problem as this issue? and if so, is it a better approach or not?

There's a related ticket #2577845: Ensure that all core tokens return SafeStringInterface if needed that I think should be a follow-up for whichever of these methods we go with (assuming crossover of purpose). Or maybe it's a duplicate?

AdamPS’s picture

Thanks @darvanen for taking a look at this issue.

I would say there are 3 separate problems that all need fixing. They could be fixed in any order (it's not really that one "follows-up" another).

If you are in agreement, any chance you could set the issue to RTBC please?

AdamPS’s picture

Issue summary: View changes

Is it still the plan to deprecate passing Markup to replace()? Or does the IS need an update?

I don't see how we can - there is core code that returns ambiguously either string or MarkupInterface. It would just force calling code to put in (string) casts which wouldn't really achieve anything.

Edit: I updated the IS.

darvanen’s picture

@larowlan you called for the deprecation warning in #49 and I tend to agree. What do you think of #91?

larowlan’s picture

I'd be interested in seeing what failed if we added the deprecation so we could gauge how much misuse there is in core
It may not end up being the final patch (although I still feel that it should be)

Kind of like #3232018: Trigger an error when using \Drupal\Core\Cache\RefinableCacheableDependencyTrait::addCacheableDependency with a non CacheableDependencyInterface object which shows us all the misuse of that API in core.

darvanen’s picture

Status: Needs review » Needs work
Issue tags: +Needs tests

#90: Ok cool, we can worry about those later, or in parallel, as you say.

#91 / #93: I see we've been around the mulberry bush a few times on this and have some competing priorities which must be a bit frustrating. I've gone back through the comments and I think @larowlan you might still be thinking ::replace should eventually be proxied to ::replaceString while @AdamPS has left it in place instead of creating a new ::replaceUnsafe method.

There is a failing patch which demostrates the scope of the problem at #65.

So once again I've come around to your point of view @AdamPS :) thanks for updating the IS.

I think if we did want to do away with the "magic" (but also somewhat broken) ::replace method that could be addressed in a separate issue. This issue will enable change, we don't have to carry out all of that change at once.


However, we can't make this RTBC without test coverage of the new API surface, that's just asking for it to be sent back :)

I think \Drupal\Tests\Core\Utility\TokenTest would be a good place to specficially lock in the behaviour of the new methods.

larowlan’s picture

ok thanks @darvanen, I've been in and out of this issue so if you, Adam and Alex have moved on and have a consensus of how to go forward - ignore me

AdamPS’s picture

Thanks for the comments. Here is the test.

I also added one last try at the deprecation - it looks like some recent fixes in other issues have solved some of the problems and I fixed one more place in views.

AdamPS’s picture

FYI I looked more closely at #2578569: Move token sanitization from Token::replace() to Token::generate() and I propose it could be marked as "Closed (works as designed)"

AdamPS’s picture

AdamPS’s picture

Issue summary: View changes

Wow - that was easier than I expected! The deprecation is back in, I have adjusted the CR and IS.

AdamPS’s picture

Version: 9.3.x-dev » 9.4.x-dev
larowlan’s picture

This is looking good to me, it would be good for @alexpott to cast another eye over the latest changes and responses to his earlier feedback

darvanen’s picture

Looking good to me too.

#99: Well that's good news! I must say I'm a tiny bit suspicous of test failures disappearing like that but if you've found recent work to explain it that's good. Perhaps you could provide one example of where this has happened? Or if that was an assumption we can go looking.

Review of whole patch at #98:

  1. +++ b/core/lib/Drupal/Core/Utility/Token.php
    @@ -174,15 +174,83 @@ public function __construct(ModuleHandlerInterface $module_handler, CacheBackend
    +      @trigger_error('Passing MarkupInterface to Token::replace() is deprecated in drupal:9.3.0 and is removed from drupal:10.0.0. Use replaceMarkup() or replacePlain() instead. See https://www.drupal.org/node/3232491', E_USER_DEPRECATED);
    

    This will now need to reference 9.4.0 and 11.0.0 instead.

  2. +++ b/core/modules/locale/tests/src/Kernel/LocaleStringIsSafeTest.php
    @@ -75,18 +78,23 @@ public function testLocalizedTokenizedString() {
    +      $tokenized_string = \Drupal::token()->replaceMarkup($safe_string);
    +      $this->assertTrue($tokenized_string instanceof MarkupInterface);
    +      $this->assertTrue($tokenized_string !== $safe_string, 'New markup object created from original');
    +      $rendered_tokenized_string = \Drupal::theme()->render('locale_test_tokenized', ['content' => $tokenized_string]);
    +      // After token replacement, string should still be marked as safe. Check
    +      // it is escaped the way we expect.
    +      $this->assertTrue($rendered_tokenized_string instanceof MarkupInterface);
    +      $this->assertEquals($test['replaced'] . "\n", $rendered_tokenized_string, 'Security test ' . $i . ' after translation after token replacement');
    

    Thanks for going the extra mile on this, much easier to read and understand.

  3. +++ b/core/modules/views/src/Plugin/views/area/Text.php
    @@ -62,7 +62,7 @@ public function render($empty = FALSE) {
    -        '#text' => $this->tokenizeValue($this->options['content']['value']),
    +        '#text' => $this->tokenizeValue((string) $this->options['content']['value']),
    

    The area field in a view is filtered text, it can contain markup. I don't think we should be casting this to a string here, unless I've missed something about how this works?

  4. +++ b/core/tests/Drupal/Tests/Core/Utility/TokenTest.php
    @@ -294,4 +295,34 @@ public function providerTestReplaceEscaping() {
    +    $tokens = [
    +      '[site:name]' => 'Your <best> buys',
    +      '[site:slogan]' => Markup::Create('We are <b>best</b>'),
    +    ];
    

    Thanks for the test, it has enough coverage in my opinion. There's a side-effect here of locking in the string type of the site name and slogan but I don't think that should be an issue, I certainly don't expect those values to change. I think the test illustrates the issue well.

Review of draft change record:

  1. Please include a one-liner on why this is happening (double-escaping etc).
  2. Needs an "Impacts" heading and value.
darvanen’s picture

Comment by @andypost on #2578569: Move token sanitization from Token::replace() to Token::generate() that's more relevant here:

Tests should cover translatability of markup, ++ to new methods and deprecate old ones for D11

In D10 under PHP 8.0 it could use https://wiki.php.net/rfc/stringable which has polyfill in Synfony

I'm not sure how the token replacment method has any effect on translatability though.

darvanen’s picture

Just closed #2577845: Ensure that all core tokens return SafeStringInterface if needed as a duplicate of this issue . Happy to be wrong, but I'm pretty certain about that one.

I think #2578193: How to deal with invalid URLs which are result of token replacements is solved by the approach we're taking here and would like to also close as duplicate but am less sure. Thoughts over on that ticket please?

AdamPS’s picture

Thanks @darvanen for a thorough review. All changes done, except where commented below.

#102.1

This will now need to reference 9.4.0 and 11.0.0 instead.

Done for now. @catch said to @larowlan that we should commit to 9.4.x then after we can discuss a backport - in which case we can change it back.

#102.3

The area field in a view is filtered text, it can contain markup. I don't think we should be casting this to a string here, unless I've missed something about how this works?

I agree, it can contain markup. The tokenizeValue() function is a bit hacky as it is ambiguous whether the value contains unsafe HTML or markup. The new code casts it the input string to force it to unsafe, hence the result is also unsafe - which is fine because the processed_text render element sanitises the text. This is exactly the same situation as the old code, where the result is unsafe.

If we want to keep the deprecation this is the only solution I see (short of a major rewrite). It would be the same for anywhere in contrib where the input value is ambiguous. Casting to string is OK, provided that you treat the result as unsafe markup.

#103

I'm not sure how the token replacement method has any effect on translatability though.

Yes I agree. The only affect that I can see is that token replacement forces the translation to take place by casting $text to string - whereas otherwise the translation might wait until rendering. Anyway this issue hasn't changed anything relating to translation.

#104

Just closed #2577845: Ensure that all core tokens return SafeStringInterface if needed as a duplicate of this issue

I commented on that issue.

Status: Needs review » Needs work

The last submitted patch, 105: token-plain-text-2580723-105.patch, failed testing. View results

larowlan’s picture

Title: Unclear documentation and incorrect usage of Token::replace() » Split Token::replace into ::replaceMarkup() and ::replacePlain() and deprecated ::replace()

Updating title as this is more than just invalid documentation

darvanen’s picture

Status: Needs work » Reviewed & tested by the community

#105: That all makes sense, thanks in particular for spelling out the answer to 102.3.

As I suspected, the failure was a random one not to do with this patch which is now passing.

I can't see anything outstanding so let's see how we go.

AdamPS’s picture

Title: Split Token::replace into ::replaceMarkup() and ::replacePlain() and deprecated ::replace() » Split Token::replace(), adding new functions into ::replaceMarkup() and ::replacePlain()
Status: Reviewed & tested by the community » Needs review

Thanks both. I agree with the concept of changing the title - I tweaked it a little for accuracy as ::replace() is not deprecated entirely, only for specific cases.

AdamPS’s picture

Title: Split Token::replace(), adding new functions into ::replaceMarkup() and ::replacePlain() » Split Token::replace(), adding new functions ::replaceMarkup() and ::replacePlain()
darvanen’s picture

Status: Needs review » Reviewed & tested by the community

I don’t think you need to change it to needs review for a title change 🙂

AdamPS’s picture

Oops sorry my browser had cached the "Needs review" status and had set it back.

alexpott’s picture

Status: Reviewed & tested by the community » Needs work
  1. +++ b/core/lib/Drupal/Core/Utility/Token.php
    @@ -174,15 +174,83 @@ public function __construct(ModuleHandlerInterface $module_handler, CacheBackend
    +    if ($unsafe instanceof MarkupInterface) {
    +      @trigger_error('Passing MarkupInterface to Token::replace() is deprecated in drupal:9.4.0 and is removed from drupal:11.0.0. Use replaceMarkup() or replacePlain() instead. See https://www.drupal.org/node/3232491', E_USER_DEPRECATED);
    +    }
    

    I think we should change this to string type check. I.e. if (!is_string($unsafe)) {... We should also deprecate passing NULLs to this method because it makes no sense.

  2. +++ b/core/lib/Drupal/Core/Utility/Token.php
    @@ -174,15 +174,83 @@ public function __construct(ModuleHandlerInterface $module_handler, CacheBackend
    +   * @return \Drupal\Component\Render\MarkupInterface
    +   *   The entered HTML markup with tokens replaced.
    +   */
    +  public function replaceMarkup(MarkupInterface $markup, array $data = [], array $options = [], BubbleableMetadata $bubbleable_metadata = NULL) {
    

    Should have a return type hint.

  3. +++ b/core/lib/Drupal/Core/Utility/Token.php
    @@ -174,15 +174,83 @@ public function __construct(ModuleHandlerInterface $module_handler, CacheBackend
    +   * @return string
    +   *   The entered plain text with tokens replaced.
    +   */
    +  public function replacePlain(string $plain, array $data = [], array $options = [], BubbleableMetadata $bubbleable_metadata = NULL) {
    

    Should have a return type hint.

  4. +++ b/core/lib/Drupal/Core/Utility/Token.php
    @@ -174,15 +174,83 @@ public function __construct(ModuleHandlerInterface $module_handler, CacheBackend
    +   * @param string|\Drupal\Component\Render\MarkupInterface|null $text
    +   *   A string containing replaceable tokens.
    

    I don't think we should document NULLs here if at present they are allowed. This should be deprecated.

  5. +++ b/core/lib/Drupal/Core/Utility/Token.php
    @@ -174,15 +174,83 @@ public function __construct(ModuleHandlerInterface $module_handler, CacheBackend
    +   * @return string
    +   *   The token result is the entered HTML text with tokens replaced. In case
    +   *   of unsafe markup, the caller is responsible for choosing the right
    +   *   sanitization. If the result is intended to be used as plain text, using
    +   *   PlainTextOutput::renderFromHtml() is recommended.
    +   */
    +  protected function doReplace(bool $markup, $text, array $data, array $options, BubbleableMetadata $bubbleable_metadata = NULL) {
    

    Should have a return type hint. (This is for new methods only)

  6. +++ b/core/modules/locale/tests/src/Kernel/LocaleStringIsSafeTest.php
    @@ -75,18 +78,23 @@ public function testLocalizedTokenizedString() {
    +      $this->assertTrue($tokenized_string instanceof MarkupInterface);
    

    This assert is not necessary once you have return typehints.

  7. +++ b/core/tests/Drupal/Tests/Core/Utility/TokenTest.php
    @@ -294,4 +295,34 @@ public function providerTestReplaceEscaping() {
    +    $this->assertFalse($plain instanceof MarkupInterface);
    

    This assert is not necessary once you have return typehints.

  8. I think the test added should be split into two... testReplaceMarkup() and testReplacePlain
AdamPS’s picture

Thanks @alexpott. Updated patch contains all except 1&4. I did try deprecating the NULL and I recall there were lots of problems. I'll create a second patch after to reveal them again.

AdamPS’s picture

AdamPS’s picture

FileSize
19.8 KB
1.61 KB

Interesting - the is_string() test of course excludes NULL, so this means the deprecation of NULL is fine.

New patch fixes the deprecation notice to be consistent with the revised code, which is also clearer in the case that NULL is passed in. The comment already specified @param string so it should be pretty self-explanatory.

AdamPS’s picture

As far as I know this is now ready for commit many thanks.

AdamPS’s picture

OK I found a problem with the change proposed in #113.5.

The function doReplace() will return NULL in case of replace(NULL), which is deprecated but still allowed until 11.0.0. This fails to meet the new type hint.

I propose that the neatest fix is to change doReplace() so that the parameter is explicitly string. MarkupInterface will automatically convert. Then we add a string cast in replace() in case of the NULL input.

alexpott’s picture

Status: Needs review » Needs work

@AdamPS nice catch... it feels like we should have test coverage of passing a NULL in.

+++ b/core/lib/Drupal/Core/Utility/Token.php
@@ -174,15 +174,83 @@ public function __construct(ModuleHandlerInterface $module_handler, CacheBackend
+      @trigger_error('Passing a non-string parameter to Token::replace() is deprecated in drupal:9.4.0 and is removed from drupal:11.0.0. Use replaceMarkup() or replacePlain() instead. See https://www.drupal.org/node/3232491', E_USER_DEPRECATED);

This needs test coverage. Using $this->expectDeprecation().

AdamPS’s picture

catch’s picture

+++ b/core/modules/locale/tests/src/Kernel/LocaleStringIsSafeTest.php
@@ -54,19 +55,21 @@ public function testLocalizedTokenizedString() {
       1 => [
         'original' => 'Go to the <a href="[locale_test:security_test1]">frontpage</a>',
-        'replaced' => 'Go to the &lt;a href=&quot;javascript:alert(&amp;#039;Mooooh!&amp;#039;);&quot;&gt;frontpage&lt;/a&gt;',
+        'replaced' => 'Go to the <a href="javascript:alert(&#039;Mooooh!&#039;);">frontpage</a>',
       ],
       2 => [
         'original' => 'Hello <strong>[locale_test:security_test2]</strong>!',
-        'replaced' => 'Hello &lt;strong&gt;&amp;lt;script&amp;gt;alert(&amp;#039;Mooooh!&amp;#039;);&amp;lt;/script&amp;gt;&lt;/strong&gt;!',
+        'replaced' => 'Hello <strong>&lt;script&gt;alert(&#039;Mooooh!&#039;);&lt;/script&gt;</strong>!',
       ],
     ];

Given this test was apparently wrong before, I think it would be worth some comments here explaining what we're testing, why things end up escaped how they are etc.

AdamPS’s picture

larowlan’s picture

Priority: Normal » Major
Status: Needs review » Reviewed & tested by the community
Issue tags: +Bug Smash Initiative

I think all of the remaining feedback has been addressed here

Updated issue credits

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 122: token-plain-text-2580723-122.patch, failed testing. View results

AdamPS’s picture

Status: Needs work » Reviewed & tested by the community

Unrelated test failure, now working again

effulgentsia’s picture

Status: Reviewed & tested by the community » Needs review

Overall, I think this patch is great!

However,

+++ b/core/modules/system/system.module
@@ -1091,7 +1090,7 @@ function system_mail($key, &$message, $params) {
-  $subject = PlainTextOutput::renderFromHtml($token_service->replace($context['subject'], $context));
+  $subject = $token_service->replacePlain($context['subject'], $context);

While I think the proposed change is the desired behavior, is it a BC break that it's changing existing behavior? Am I correct that the BC version of this would be:

$subject = $token_service->replacePlain(PlainTextOutput::renderFromHtml($context['subject']), $context);

Suppose for example, some module wanted to send an email with a subject of: What do you think of the <blink> tag?. Given HEAD's current behavior, it would have to set the subject to: What do you think of the &lt;blink&gt; tag?, so that the existing renderFromHtml() preserves and decodes it.

I think what we're saying in this issue is that that's a bug: that given that the semantics of an email subject is that it is plain text, that the caller should be able to express that as What do you think of the <blink> tag? without having to escape it, and I agree with that. However, what do we do about the fact that there might be callers that are escaping it, because that's what was required to make it work with Drupal's existing behavior? If we commit this patch as-is, then for those callers that were escaping it to workaround what is arguably a bug, we'll start sending emails without decoding those escaped entities.

I wonder if in this issue, if we should change this and the other usages to the backwards compatible invocation (where we make core's callers to replacePlain() also explicitly renderFromHtml() the first argument too), with todos linking to issues to remove that renderFromHtml() call, and that way we can figure out if we need to deal with BC considerations in those respective issues on a case by case basis given the nature of each usage, rather than holding up this patch on figuring that out.

Or would we rather try to figure out the BC implications of each of those callers here rather than punting that?

AdamPS’s picture

Thanks @effulgentsia for raising this, it is a good question to ask and very well explained.

It's true, this fix will 'break' any sites that have compensated for the bug and are now relying on the bugged behaviour. However I think the same could be said for a fair proportion of bug fixes.

The problem with the "BC" version of the fix that you propose is that I think it doesn't actually fix the bug😃. What do you think of the <blink> tag would still be corrupted to What do you think of the tag.

I think the only thing we can do here is to add a warning to the change record, which I have now done.

effulgentsia’s picture

What do you think of the <blink> tag would still be corrupted to What do you think of the tag.

Hm, I don't understand why. Is there something else in our mail system that messes with the email subject after HEAD's existing PlainTextOutput::renderFromHtml() call is made? I'll try to make some time early this week to test that out to find out.

In the meantime,

+++ b/core/lib/Drupal/Core/Utility/Token.php
@@ -174,15 +174,83 @@ public function __construct(ModuleHandlerInterface $module_handler, CacheBackend
+  public function replaceMarkup(MarkupInterface $markup, array $data = [], array $options = [], BubbleableMetadata $bubbleable_metadata = NULL): MarkupInterface {
+    $result = $this->doReplace(TRUE, $markup, $data, $options, $bubbleable_metadata);
+    // All replacements were escaped and the input string is safe so the return
+    // string is also safe.
+    return Markup::create($result);
+  }
...
+++ b/core/modules/locale/tests/src/Kernel/LocaleStringIsSafeTest.php
@@ -45,28 +46,37 @@ public function testLocaleStringIsSafe() {
+    // - Render through TWIG, which should not alter the safe markup.
+    // - NB We never ran Xss::filterAdmin() so the unsafe protocol
+    //   'javascript:' remains.
     $tests_to_do = [
       1 => [
         'original' => 'Go to the <a href="[locale_test:security_test1]">frontpage</a>',
-        'replaced' => 'Go to the &lt;a href=&quot;javascript:alert(&amp;#039;Mooooh!&amp;#039;);&quot;&gt;frontpage&lt;/a&gt;',
+        'replaced' => 'Go to the <a href="javascript:alert(&#039;Mooooh!&#039;);">frontpage</a>',

This doesn't seem good at all. Perhaps I need to read the comment thread that led to this decision? I think this clearly shows that just because you have a sanitized token template and sanitized token values, does not mean that you end up with an output string that is safe against XSS. So I don't think that it's correct to create a Markup object that says it is. Why not run it through Xss::filterAdmin() first before doing that?

AdamPS’s picture

Hm, I don't understand why

Well that's what PlainTextOutput::renderFromHtml() does:

$ drupal php
Psy Shell v0.11.1 (PHP 7.4.3 — cli) by Justin Hileman
>>> \Drupal\Component\Render\PlainTextOutput::renderFromHtml('What do you think of the <blink> tag?');
=> "What do you think of the  tag?"
This doesn't seem good at all. Perhaps I need to read the comment thread that led to this decision? ... Why not run it through Xss::filterAdmin() first before doing that?.

I believe the comment trail will explain why the HTML escaping was removed from the replaced values like this:

  1. A key point of this issue is that escaping corrupts valid markup that resembles HTML
  2. To solve this bug, we need to stop escaping and the input text is already safe so doesn't need to be escaped again.

It probably won't say anything more. I don't think anyone has yet considered the scenario in the way that you have just done. And I see what you mean, this definitely needs careful checking.

However I propose that the security problem actually comes from LocaleStringIsSafeTest.

The original string Go to the <a href="[locale_test:security_test1]">frontpage</a> is unsafe, as it is effectively a link to an arbitrary value. The good news is that Xss::filterAdmin() would strip out the token inside the link. This happens because [locale_test is not a secure protocol - the colon in a token (whether by accident or by design) neatly triggers the XSS filtering rule.

I propose that the test code has a horrible security bug when it comes to this line - it misuses the t() function to mark something safe when it's not.

      // Pass the original string to the t() function to get it marked as safe.
      $safe_string = t($original_string);

The call to t() resolves to constructing a FormattableMarkup, and the comment for the relevant parameter says this

   * @param string $string
   *   A string containing placeholders. The string itself will not be escaped,
   *   any unsafe content must be in $args and inserted via placeholders.

Therefore I believe that this patch is not yet proven to have a security bug, if used correctly. Probably we should change the test to stop being such a bad example. However there is clearly a risk - there might be code that is currently using the code incorrectly, but is currently safe because of the escaping.

Calling Xss::filterAdmin() is good because it makes the markup safe without the corruption from HTML escaping. However it could create a new problem: what about code that is deliberately creating HTML containing script tags based on input from a trusted source? I just tested and the "Full HTML" text format could generate such a thing and then the site could enable the Token Filter module. It seems that adding a call to Xss::filterAdmin() could break this site.

Tricky. Thoughts welcome at this point.....

AdamPS’s picture

If the attacker could control both the original string and the token value, then I have an example of how to combine safe markup to get unsafe markup so it is a real problem (I won't post it publicly).

It's worth adding at this point that these problems are not really created by this patch. The same challenge exists in contrib modules that want token replacement to work properly with the existing interface. The example in the CR illustrates it exactly - the two lines of code below have the same risks as each other

Markup::create($token_service->replace($markup));
$token_service->replaceMarkup($markup);

And if you don't call Markup::create() but instead rely on auto-escaping, then you get double-escaping.

Maybe we could add the call to Xss::filterAdmin() but have an option to disable it??

darvanen’s picture

What if we had a ::replaceTrustedMarkup?

AdamPS’s picture

After thinking more, I feel that Xss::filterAdmin() is the solution. It means that this issue not only solves some bugs but helps improve security.

What if we had a ::replaceTrustedMarkup?

That's a good idea. On balance, my current inclination is to leave it out of this commit:

  1. It's quite difficult to explain to developers what it would mean to be "trusted". Even if the $markup has been created by a 'trusted' admin user, this person likely doesn't understand the pitfalls of allowing token replacement and how to avoid them.
  2. The actual use-case where we need to avoid calling Xss::filterAdmin() seems fairly unusual/obscure.
  3. There is a workaround, which is to cast the MarkupInterface to string then call ::replace().

New patch coming up...

AdamPS’s picture

Thank you @effulgentsia for your diligence to spot this problem. Here is a new patch for review and I also updated the CR.

effulgentsia’s picture

Thanks!

Thinking of how to address #130 - #132...

It's already the case that the Markup class is documented as saying that it is internal and should only be used by the render subsystem. The token system isn't the render system, so should not be creating a Markup object. It looks like we already have some pre-existing callers in HEAD, such as system_requirements(), violating this rule, so that's not great, but perhaps we should follow the rule for this patch regardless.

In which case, the intended design of MarkupInterface is that when a new system needs it, it creates its own implementation class that can document and/or implement its own special needs. For example, we have FilteredMarkup, which only documents its own context, but does not implement anything special, and we also have TranslatableMarkup, which implements a bunch of unique things.

So for this patch, what if we create a Drupal\Core\Utility\TokenReplacedMarkup class (better name suggestions welcome), where Token::replaceMarkup() constructs a TokenReplacedMarkup object with the raw result of Token::doReplace(), and where TokenReplacedMarkup::__toString() returns that string filtered by Xss::filterAdmin(), and where we add a TokenReplacedMarkup::getUnfiltered() method (better name suggestions welcome) to return the unfiltered string?

Given #132.3, maybe we don't really need the getUnfiltered() method, but if creating a TokenReplacedMarkup class is a good idea anyway, then why not add that method too?

alexpott’s picture

A great this about having a TokenReplacedMarkup is that we can do the XSS filtering only on __toString() and therefore will only do the work if the thing is used. And yeah I think having a get unfiltered method would seem like a useful thing if well documented that the caller becomes completely responsible for santisation.

AdamPS’s picture

OK here you go. The patch keeps stretching😃 would be great to get this one in.

AdamPS’s picture

effulgentsia’s picture

Hm, I don't understand why. Is there something else in our mail system that messes with the email subject after HEAD's existing PlainTextOutput::renderFromHtml() call is made? I'll try to make some time early this week to test that out to find out.

Yes, turns out there is. system_mail() converts from html to text the first time. And then MailManager::doMail() takes what has already been converted to text, treats that as html and converts it to text a second time.

I'm not sure what that means yet in terms of the BC break explained in #126. I need to think about that a bit more.

AdamPS’s picture

And then MailManager::doMail() takes what has already been converted to text, treats that as html and converts it to text a second time.

That is #3174760: Mails resembling HTML are corrupted which I have also been working on and has an RTBC patch.

As far as I can see we need a fix to both this issue and the MailManager one before it is possible to send a mail with the subject What do you think of the <blink> tag. Wouldn't any single call to PlainTextOutput::renderFromHtml() corrupt this title?

AdamPS’s picture

I am aware I am keen to get this issue committed but actually I realise that @effulgentsia you are quite right to be cautious here.

I have a suggestion. How about if we split this issue into 2, and create a follow-on for part 2?

2) Core code sometimes doesn't follow the interface correctly

So we would keep the changes to Token.php, new file TokenReplacedMarkup.php and the change in Drupal\views\Plugin\views\area\Text.php which is needed to avoid deprecation warnings.

This hopefully leaves the issue in a commitable state, and is probably easier for people to understand - the contents of the patch are a better match with the current issue summary and the title of the CR "New alternatives to Token::replace()"

AdamPS’s picture

Title: Split Token::replace(), adding new functions ::replaceMarkup() and ::replacePlain() » Fix token system confusion, with new function Token::replacePlain()
Priority: Major » Normal
Issue summary: View changes
FileSize
6.47 KB

This issue has been a learning experience and voyage of discovery for me😃. I made some key assumptions that turned out to be wrong.

1) That substituting safe tokens in safe markup is safe.

replaceMarkup() was created to give that assumption a place to live without affecting existing code. It occurred to me this morning that seeing as this assumption is false, then there is probably no point in replaceMarkup(). The existing replace() function can take either safe or unsafe markup and it doesn't matter because the result is the same: unsafe markup.

2) That the result of replacing tokens ought to be MarkupInterface. In fact the return comment of replace() although partly wrong is also partly correct:

The caller is responsible for choosing the right sanitization, for example the result can be put into #markup, in which case it would be sanitized by Xss::filterAdmin().

It seems the same as hundreds of other places where a string contains unsafe markup - there are various correct ways to handle it.

My confusion mostly came from my attempts to send HTML emails reliably through the venerable Drupal mailsystem, as a maintainer of simplenews and swiftmailer. I found many bugs with corruption from extra escaping or extra rendering to plain making an indiscriminate vendetta on any < symbols that might legitimately occur😃. I've given up that fight and I'm now heading in a new direction with the Symfony Mailer project.

Here's my new proposal for this issue:

  1. I have split of a separate issue #3264453: Incorrect usage of Token::replace() can corrupt plain text that resembles HTML.
  2. I have removed replaceMarkup() from this issue.
  3. The remaining patch is simple, safe and hopefully easy to commit.
darvanen’s picture

Voyage of discovery indeed, for me too.

Things always get more complex before they simplify again, or so I've heard ;)

If for some reason we decide not to implement replacePlain the documentation and variable name changes now here are vital in my opinion.

AdamPS’s picture

Thanks @darvanen.

If for some reason we decide not to implement replacePlain the documentation and variable name changes now here are vital in my opinion.

Thank you for your support of the documentation and variable name changes.

I am hoping that we would implement replacePlain()😃. It's a brand new function so no BC-risk, and it should make it easier for people to use the interface correctly.

larowlan’s picture

  1. +++ b/core/lib/Drupal/Core/Utility/Token.php
    @@ -175,14 +174,61 @@ public function __construct(ModuleHandlerInterface $module_handler, CacheBackend
    -  public function replace($text, array $data = [], array $options = [], BubbleableMetadata $bubbleable_metadata = NULL) {
    +  public function replace($markup, array $data = [], array $options = [], BubbleableMetadata $bubbleable_metadata = NULL) {
    

    Confirming that this isn't a BC concern for sub-classes
    https://3v4l.org/GS5RY

  2. +++ b/core/lib/Drupal/Core/Utility/Token.php
    @@ -199,9 +245,15 @@ public function replace($text, array $data = [], array $options = [], Bubbleable
    +        $replacements[$token] = $value instanceof MarkupInterface ? PlainTextOutput::renderFromHtml($value) : $value;
    

    Any reason why this can't just use PlainTextOutput::renderFromHtml regardless?

AdamPS’s picture

Thanks @larowlan

Any reason why this can't just use PlainTextOutput::renderFromHtml regardless?

We don't because that leads to corruption of plain text that resembles HTML.

Psy Shell v0.11.2 (PHP 7.4.3 — cli) by Justin Hileman
>>> Drupal\Component\Render\PlainTextOutput::renderFromHtml('What do you think of the <blink> tag?')
=> "What do you think of the  tag?"
>>> Drupal\Component\Render\PlainTextOutput::renderFromHtml('In the case that a<b<c we can deduce ...')
=> "In the case that a"

That's exactly the sort of bug that this issue is trying to avoid, and that we are fixing in the follow-on issue #3264453: Incorrect usage of Token::replace() can corrupt plain text that resembles HTML. In general, we need to avoid rendering something that is not markup, or escaping something that is.

larowlan’s picture

Thanks, I figured that would be the case.

Can we expand the docs a bit on renderPlain to point this out?

That aside this looks greatly simplified and hence far less disruptive

AdamPS’s picture

Thanks

Can we expand the docs a bit on renderPlain to point this out?

Good idea - done

larowlan’s picture

Status: Needs review » Reviewed & tested by the community

This looks much simpler now, thanks for your perseverance here @AdamPS

roy_k’s picture

I tried patch #147. It didnt work and even more I got an error on my site:
the site encountered an error...

I had to undo it. Using D9.3.8 with php 7.4

Solved - If you want to use the entity title and you get the title wrap with HTML, use [node:field_brand:entity] token and it will give you the entity title without the HTML wrap.

larowlan’s picture

@roy_k

Can you elaborate
- what you expected to happen
- what actually happened
- what the error message was

At present your feedback is missing those details so we can't really provide any insight

roy_k’s picture

@larowlan,

Thank you but I found the solution and put it in my comment so anyone else who had the same problem can find it and use it.

I tried to use the entity title as a file name but when saving the node it created a folder by the title name wrap with HTML tag and not just the entity name. It seems that the wrong token was used and it was not clear which one to use to get it.

Version: 9.4.x-dev » 9.5.x-dev

Drupal 9.4.0-alpha1 was released on May 6, 2022, which means new developments and disruptive changes should now be targeted for the 9.5.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

alexpott’s picture

Version: 9.5.x-dev » 9.4.x-dev
Status: Reviewed & tested by the community » Fixed

Committed and pushed 74b25722ac to 10.0.x and 6a2a3a4ea9 to 9.5.x and a4cd4cf82b to 9.4.x. Thanks!

I wish this contained some of the fixes so the only use-case was not a test - but getting this in is more important so committing.

  • alexpott committed 74b2572 on 10.0.x
    Issue #2580723 by AdamPS, Berdir, andypost, darvanen, larowlan, alexpott...

  • alexpott committed 6a2a3a4 on 9.5.x
    Issue #2580723 by AdamPS, Berdir, andypost, darvanen, larowlan, alexpott...

  • alexpott committed a4cd4cf on 9.4.x
    Issue #2580723 by AdamPS, Berdir, andypost, darvanen, larowlan, alexpott...
geek-merlin’s picture

Such an important fix.

🥂!

AdamPS’s picture

Great many thanks @alexpott, @larowlan and everyone else who helped.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.