Problem/Motivation
SafeString/SafeStringInterface describes a string that is safe for use as an HTML node/fragment. This is now (or will be soon) the return value of t() and SafeMarkup::format().
However, it is not uncommon to want to use those strings in e-mail titles or HTML attributes - such as select list options.
Drupal 7 via !placeholder allowed you to create a string with no HTML escaping - this meant that some t() calls essentially returned plain text rather than HTML, specifically for e-mail subjects but this can open sites up to XSS when they're used elsewhere.
Drupal 8 with SafeString can similarly open up sites to XSS, given that the SafeString itself won't be HTML escaped when used in an HTML attribute. The result of t() and SafeMarkup are always HTML. When you manually create a SafeString it may have been XSS fitlered or Html::escape()d and we don't know.
Then on top of this, it's easy to get double escaping.
We need to consistently treat both the first argument and return values of t() and SafeMarkup::format() as HTML, so that we can translate that HTML to other formats like plain text, retaining the original meaning of the string.
Proposed resolution
Create a simple interface with a method to for HTML to text strategy classes
This is basically just a way to document the correct way to take the HTML string returned from translation wrapper and convert it to plain text.
pros: this will work for other functions that return HTML, and is consistent with how we handle e-mail bodies. Also it would handle any markup added by translators.
cons: it's not obvious why you'd take what might potentially have been a plain text string in the first place, return it 'as HTML', then convert it back to plain text again. But we will just have to document that the "transport protocol" we use for transporting strings through drupal (whether they are HTML or not) is HTML.
Remaining tasks
Reviews
User interface changes
None
API changes
None
Data model changes
None
| Comment | File | Size | Author |
|---|---|---|---|
| #166 | 2509218-165.patch | 5.61 KB | jhedstrom |
| #161 | 2509218-161.patch | 5.8 KB | jhedstrom |
Comments
Comment #2
effulgentsia commentedComment #4
effulgentsia commentedComment #5
xjm+1 for "deprecating" the option this way.
It's interesting that the patch is green when my Views patch has a failure though -- didn't figure out how that happened yet. :)
I think we need to keep documentation of this indicating it is a legacy placeholder type that now behaves the same as @ but is deprecated in 8.0.x and that support for it will be removed before 9.0.0.
Comment #6
effulgentsia commented#2558791: "!"-prefixed tokens should Xss::filterAdmin() but not affect safeness takes a different (and IMO, better) approach to the
!semantics. But I still think the 'html' => FALSE option for t() is worth adding, so retitling.Comment #7
joelpittetCould this be a bit more generic so it's inline with other strategies in twig? Doesn't have to be exact
'is_safe' => ['html']But the idea is that you can make it safe for different contexts.
['js', 'html']or['all'], what do you think?https://github.com/twigphp/Twig/blob/5fdbd991bfcf5ea2492c9ab074a2d2cde18...
https://github.com/twigphp/Twig/blob/5fdbd991bfcf5ea2492c9ab074a2d2cde18...
Also I have tests over here #2531824: Attribute class to check safe strings before escaping (has tests) for a bug that looks like it's partially addressed by this patch for feed_icon.
Comment #8
alexpottI think given #2557113: Make t() return a TranslationWrapper object to remove reliance on a static, unpredictable safe list we have another option here. If
t()returns aTranslationWrapperthen we can add a method on it to do this. See patch.Comment #10
stefan.r commentedI like the idea in #8, it would solve part of the concern in #2558791: "!"-prefixed tokens should Xss::filterAdmin() but not affect safeness
s/translate/translateForNonHtml/
Should we be more forceful about how dangerous this is if it ends up in HTML (or javascript)?
Maybe just "wrapped in
<em>" as opposed to themed?Just an idea, but if we're worried about developers not thinking when they use this, maybe a 'strip_tags' key in the options array that defaults to TRUE...
Should we list a few example use cases here, i.e. error logs and plain text email messages?
newline
Comment #11
stefan.r commentedIf we have a string that's intended for non-HTML output only this muddles the meaning of the placeholder prefixes a bit as both @, ! and any other prefix would output the unescaped/unfiltered string... but I guess there's no way around that when we have a TranslationWrapper that has a single translatable string that may be output to both HTML and non-HTML.
Comment #12
effulgentsia commentedJust a reroll.
Comment #13
effulgentsia commentedPer #2506427-29: [meta] !placeholder causes strings to be escaped and makes the sanitization API harder to understand, restoring this issue to its original scope.
Comment #18
subhojit777Reducing the number of failing tests
Comment #19
subhojit777I have just updated the tests since now
!will behave similar to@.Comment #22
joelpittetWe need to move #2506445-170: Replace !placeholder with @placeholder in t() and format_string() for non-URLs in tests here. This is the diff:
These are for email tokens.
Comment #23
xjmI think we should probably postpone this on #2557113: Make t() return a TranslationWrapper object to remove reliance on a static, unpredictable safe list at this point, since that will change this patch.
Comment #24
xjmComment #25
xjmAlso, I think we should actually separate the two scopes of this issue. "Add new API for t() to work for non-HTML text" can mean simply using the new behavior from #2557113: Make t() return a TranslationWrapper object to remove reliance on a static, unpredictable safe list to allow @placeholder to be used for emails since they would then be escaped at the same time as other strings, which will improve the DX overall by simplifying things.
However, deprecating
!placeholderand making it behave like@placeholderis much broader in scope and does not have consensus. We may actually remove it instead, and anyway fixing the test failures for that change will duplicate efforts on #2506445: Replace !placeholder with @placeholder in t() and format_string() for non-URLs in tests and friends.Can we create a separate issue for that if we still think it's worth doing (as opposed to just removing them all)?
Comment #26
xjmComment #27
xjmAlso, there are some incorrect hunks in the current patch.
Comment #28
effulgentsia commentedIn #18, the working assumption was that once #2557113: Make t() return a TranslationWrapper object to remove reliance on a static, unpredictable safe list lands, the API for getting a translated string for non-HTML use (such as email subject or JSON output) would be something like:
with that method name still open for discussion (other options might be
->toPlainText(), etc.).On a hangout earlier today with a bunch of people, @catch brought up another idea:
where that toText() method could call MailFormatHelper::htmlToText() (or we move that implementation into the Html class).
A benefit of that is that it could be the same API for when we want to convert the HTML output of something other than t() to plain-text, such as:
A drawback might be that if you look at that current implementation of MailFormatHelper::htmlToText(), maybe that's a lot of specific choices on what to do that don't really make sense for strings coming out of t()? Then again, maybe they do?
Comment #29
effulgentsia commentedAlso, raising this to Critical, because it's impossible to remove the last usages of
!placeholderfrom core, such as:without it.
Comment #30
xjmOne thing #28 does not mention is the risk of adding "
!placeholderby another name".Also, I'm confused that all the suggestions in #28 involve adding methods. I thought that the idea was that converting to late rendering of the translations would mean that the placeholders could be escaped then, in the render process, so they wouldn't be escaped otherwise, same as the expectation for Twig.
Comment #31
dawehnerWell yeah at some point though you need to convert the object to a string. Maybe we could also go with
TranslationWrapper->render($strategy)Comment #32
catchRe-titled and updated issue summary to try to summarize the two approaches.
Also if we go for option #2, this wouldn't need to be postponed, so moving back to active for the discussion at least.
Comment #33
plachNot sure whether this is BS, since I just started to get my feet wet with the
SafeMarkupstuff, but it seems to me that here we are needing an alternative sanitization logic. If we were able to instantiate a different sanitization service depending on the mime type of the output, we could use placeholders semantically and let each service figure out the most appropriate sanitization strategy. We could default totext/htmlas sanitization context and allow to specify alternative ones via$options. For example:Comment #34
dawehnerWell, the problem is that like for tokens, the place which generates the t() cannot know yet, how its gonna be used, so making it lazy would help with that.
@plach
I think we should do something, that is as parallel as twig, which means you would pass a sanitization strategy?
Comment #35
catchSo that's true for t() strings that end up in e-mail bodies - but we (should) already use MailHelper::htmlToText() for that.
I'm not sure it's true for e-mail subjects though - otherwise those places wouldn't be explicitly using the !placeholder to avoid sanitization.
i.e.
$message['subject'] .= t('[!site-name] !subject', $variables, $options);from contact.moduleComment #36
dawehnerWell, IMHO for email subjects we should be able to strip all tags.
Comment #37
plach@dawehner:
On one hand we cannot have reliable output sanitization without knowing the output format, OTOH I realize that code generating token might miss the required contextual information, at least currently.
I think we'd have two ways to cope with this:
$options, as we are doing for language right now. How we'd implement that practically I'm not sure about: maybe a user could choose between[title]and[title:plain]or stuff like that.DynamicStringWrapperthat could be passed around until contextual information about output format is finally available. This would allow us to have something like the following:A token value could simply be wrapped into a child
TokenStringWrapperhaving simply@valueas hard-coded string pattern.In this scenario
@valueand:urlplaceholders would determine the value's semantics instead of the sanitization logic: the former would indicate any plain string, while the second would indicate a URL string. This way, depending on the output format, we would know whether escaping the value or sanitizing the URL protocol is needed, for instance.I think this is not enough: if the string was "HTML-escaped" previously, stripping tags would have no effect and lots of bogus
<and>could creep into the mail subject.Comment #38
catchJust discussed this with plach, I'd thought that #2569485: Add AttributeSafeStringInterface and UriAttributeSafeStringInterface was a competing approach to this, but actually I think we should use both.
You get a
$translated_stringobject.The object has a ->renderAsHtmlAttribute() method.
This strips tags (to avoid
<em>tags either being invalidly not escaped, or uglily escaped in an HTML attribute.It also encodes entities to avoid XSS.
And returns an AttributeSafeStringInterface.
You can then pass an AttributeSafeStringInterface into AttributeString, so that it doesn't end up getting run through Html::escape() when it's already in the right format.
If we only did renderAsPlainText() then that might be OK for an e-mail subject but it's not necessarily OK for an HTML attribute.
If we only do AttributeSafeStringInterface then you can still end up with either escaped or unescaped HTML tags in attributes.
If we do both all cases are covered and it's clear what should be used for what.
Comment #39
plachI'll experiment a bit with this, although it would be better to get #2557113: Make t() return a TranslationWrapper object to remove reliance on a static, unpredictable safe list in first.
Comment #40
plachHere is a first stab, just to get an idea of how things could look like. This includes also #2557113-185: Make t() return a TranslationWrapper object to remove reliance on a static, unpredictable safe list, the interdiff is the new code.
Comment #43
catchSo right now the first argument to t() is literal HTML and does not get escaped or filtered again (since it's marked as a SafeString). We also allow translated strings to have HTML in them.
We should probably make that more explicit than it currently is in the documentation, https://api.drupal.org/api/drupal/core%21includes%21bootstrap.inc/functi... doesn't really make it clear that the first argument is HTML at all, except saying don't put variables in there.
Given in this issue we're talking about three different formats of returned string - HTML + plain text + HTML attribute, we need to figure out what that looks like.
Let's say we start with the following string:
1. HTML:
Nothing happens, you get this back:
Which will look like
in the browser (i.e. a select option or the title tag of an image or whatever).
2. Plain text:
plach suggested html_entity_decode(strip_tags($string)); The other option would be factoring out https://api.drupal.org/api/drupal/core!lib!Drupal!Core!Mail!MailFormatHe...
That would return:
The <em> tag makes your text look like "this".Fine for e-mail subject lines and similar.
3. HTML attribute.
For this, I think we'd want to take the plain text string, then Html::escape() it, so:
Html::escape(html_entity_decode(strip_tags($string)));
That gets us:
'The <em> tag makes your text look like "this".Which will look like:
In the browser.
Having typed that out makes me wonder the following (assuming the above is what we want, it might not be):
If we can get a plain text output from t(), then we can pass that plain text to Attributes, and AttributeString would Html::escape() it - and maybe that's enough of an API for passing the results of t() to attributes (which Views currently does).
Comment #44
catchThen that same example with arguments:
1. HTML:
First argument is escaped, second argument is marked as a SafeString so is not escaped. Return is correct HTML.
2. Plain text:
First argument is not escaped- this is fine.
Second argument is ... we could strip_tags() safe string arguments but erggh.
I think it'd be easier to get the HTML string first, then apply the same html_entity_decode(strip_tags($string)) to that. Gets us the same result regardless of whether the HTML is in the string or replacements then.
3. Attributes, as before we just Html::escape() the plain text value.
Comment #45
stefan.r commentedHmm, not sure this is secure for all attributes, we may still need some special cases for other attributes such as on* then, and think about what to do about attributes that are not wrapped in double quotes.
The Html::escape(html_entity_decode(strip_tags($string))); might be fine for email subjects but the htmlToText helper is much nicer for longer text, I also quite like some of the things in https://github.com/soundasleep/html2text that we could be doing in htmlToText too.
Comment #46
catchSo at the moment we just don't support passing user-entered content to on* (and never should). But if we wanted to provide a way to encode them properly I think we need a new issue for that - would need to be applied to Attributes and Xss::attributes() too.
Yes I think that's an open question.
Comment #47
effulgentsia commentedI haven't thought through #43 and its related comments yet, so this is not a commentary on that, but just want to respond to this part from the issue summary:
I don't see this as a con, because IMO the "another name" part is enough to address the "makes the sanitization API harder to understand" part of the parent issue. Here's what I mean:
In Drupal 7,
!can be used to signify many different things, among which are:Note that with each of the above cases, the calling code in question might be incorrect in its assumptions (each item below maps to the same number above), each mistake resulting in an XSS vulnerability:
The current implementation of
!in Drupal 8 HEAD solves nicely for the first two cases, whether you're right or wrong. If you're right, you get exactly what you asked for, and the resulting string isn't marked safe and doesn't need to be for those contexts. If you're wrong, you end up getting the escaping of the entire string, which is exactly what you should get when a string you thought you were returning for non-HTML use gets used in HTML.The current implementation of
!in HEAD also solves nicely for the 3rd case if you're right, but causes escaping of the entire t() output (not just the incorrect placeholder) if you're wrong. And the current implementation breaks down even more for the last two cases, regardless of if your assumptions are right or wrong.The problems with cases 3-5 are the reasons to remove
!entirely. But not arguments against solving just cases 1 and 2 with an API that's clear about that reduced scope.Comment #48
catchSo for me #2 is the tricky one, which #43 attempts to unpick, and I think plach's patch is compatible with what's in #43.
The problem being is that the output of t(), with the same strings and context, could be used as either an attribute value or an HTML fragment, certainly the Views info stuff is like that - and those need two different strategies.
I'd also add an example #6, which is the reverse problem of #1:
6. You thought you were creating a string for drupal_set_message() (or any HTML output), but someone put it into the subject of an e-mail instead.
#43 allows us to handle that case too.
Comment #49
plachThis implements #43 - #44 and provides unit tests for that. I will start looking at test failures as soon as #2557113: Make t() return a TranslationWrapper object to remove reliance on a static, unpredictable safe list is committed, since I don't want to waste time chasing every iteration. Most of the failures are unit tests that just need to be adapted to the new API.
Comment #50
plachDone for tonight
Comment #51
plachHere's a version of #49 including only the parts added here, except for some small bits in
TranslationInterfaceandTranslationManager.Comment #54
plachAdded missing PHP docs
Comment #55
subhojit777Comment #56
plachRebased on top of #2557113-225: Make t() return a TranslationWrapper object to remove reliance on a static, unpredictable safe list.
Comment #59
almaudoh commentedI really like this approach. Re-uploaded the review.txt patch in #56 since #2557113: Make t() return a TranslationWrapper object to remove reliance on a static, unpredictable safe list is now in.
Working on a service / plugin manager for replacing placeholders based on content type.
Comment #60
almaudoh commentedComment #61
almaudoh commentedIn this patch...
1. Added a plugin manager with
@OutputFormatterannotation to manage the different kinds of output formatters.2. Moved the three output formatters to the Plugin/OutputFormatter directory
Needs tests for the new classes.
Comment #64
pwolanin commentedThis patch looks like it's off track in terms of scope and what it's doing. Let's make it as small as possible - ideally just doc and no API changes
Comment #65
plachThanks, some copy/paste issues:
Comment #66
plachI'm not sure what API changes you are referring to, can you expand on that? Also, I don't really think we are going to address this issue just with documentation.
Comment #67
pwolanin commentedPlease update the issue summary before posting any more patches.
I can't even tell if this is going in the right direction. I think adding more complexity to t() and the rest of the API here might be the wrong thing.
From the issue summary I was hoping this would be mostly a documentation issue - instead it's looking light a significant API change.
Comment #68
pwolanin commented@plach - option #2 in the current issue summary suggests it can mostly be a docs issue. It's not clear to me when/if that option was discarded.
Comment #69
plachSorry, we are currently going a different way, see #33 and #37, I will update the issue summary ASAP.
Comment #70
pwolanin commentedDiscussing with alexpott at MOB. We need a conversation in person about broadening this into to a generic (but as simple as possible) system to allow us to render any SafeString with a context-relevant formatter.
Comment #71
catchThis is still critical, and it blocks #2571673: Convert Views t() usage where it is used as an attribute value, which is also critical, even if you don't personally like the solution.
The problem is very much there, and does not 'need more info'.
Comment #72
stefan.r commentedComment #73
catchDiscussed more with pwolanin and alexpott at the sprint.
I don't think we should use plugins for this, @alexpott suggested just a method that takes a formatter and returns the string in the format, that seems plenty for extensibility to me. It'll be an interface with one method.
There's still a bit of discussion as to whether when we format for an attribute, do we return a plain text string that we then escape again, or return an escaped string and communicate to AttributeString and elsewhere that it doesn't get escaped again (via AttributeFormattedStringInterface or whatever which we don't have yet). However hopefully this clarifies exactly what the need is with the updated issue summary.
Comment #74
catchComment #75
catchComment #76
catchComment #77
wim leersI read through the entire issue, and had the same remarks/questions/doubts about using plugins as #73. But catch posted it 10 minutes before I did :P
@pwolanin and @catch confirmed that we are now going with option 3 in the issue summary. That's why he tried to strike through the rest in #76, but failed miserably, because
<del>is not able to strike through block-level elements :PComment #78
pwolanin commentedreally more like #2
Comment #79
pwolanin commentedreally more like #2
Comment #80
pwolanin commentedAfter further discussion just having an interface and classes with a static method is the simplest possible way to solve this problem.
Here's a starting patch showing the general idea.
Comment #81
pwolanin commentedComment #82
pwolanin commentedComment #83
dawehnerAdded some tests
Comment #84
plach@pwolanin @dawehner:
Can we restore the test cases using placeholders that were introduced in #49?
Comment #85
plachWorking on this
Missing FQCN
Missing newline
Comment #86
plachMore tests and docs
Comment #88
plachFixed test failures.
Comment #89
plachProbably this interface should provide more details about what output strategies are, when to use them and how.
Comment #90
catchWould be good to avoid 'to be used' twice in the sentecne.
"Implements an output strategy used to format strings for HTML attribute values."?
We can skip the Html::escape() - Attributes/AttributeString will do that. I think that's what we decided this afternoon.
Even if we don't have the separate output strategy for attributes vs. plain text, we could have an integration test with AttributeString to ensure it looks right after escaping?
Comment #91
effulgentsia commented+1 to the general approach here.
Re #90.2: if we keep
HtmlAttributeValueOutputin this patch at all, let's add a @todo for it to do the escaping AND upcast it to a AttributeSafeStringInterface once #2569485: Add AttributeSafeStringInterface and UriAttributeSafeStringInterface is in.Should we also add a
FormattedPlainTextOutputstrategy (or better name if someone comes up with one) that invokes MailFormatHelper::htmlToText()? And if we do, then should we renamePlainTextOutputtoSimplePlainTextOutputor similar name, or is it ok for that one to not have any qualifying prefix?Comment #92
stefan.r commented+1 to #91 - I had discussed this with @plach earlier today and the conclusion was a "fancier" version of the plain text output for larger text could be a nice non-critical followup here. SimplePlainTextOutput and FormattedPlainTextOutput sound great to me
Comment #93
almaudoh commentedSome docs nits mostly...
Contains \Drupal\Component\Utility\HtmlAttributeValueOutput
"or a any" :)
OutputStrategyInterface doesn't really explain for me what this does. Maybe OutputEscapeStrategyInterface or OutputFormatStrategyInterface. Or perhaps just OutputFormatInterface.
So what's the plan for implementation of these on SafeStrings? The IS is not very clear on this. Are we adding a new
:: renderAsFormat(OutputStrategyInterface $strategy)method to SafeStringInterface...?Comment #94
dawehnerSmal changes here and there.
Comment #96
plachNot working on this atm...
Comment #97
lauriii+1 for the SimplePlainTextOutput to make it possible to create more plaintext output strategies
Comment #99
pwolanin commentedI think all of these just return a simple string, so I don't think we should return a HtmlAttributeValueOutput or anything else for any of these.
Comment #100
pwolanin commentedI put this here originally, but actually we can accept any object that implements __toString
Not sure how to note that in the @param
Typo here, but I also think the comment isn't right - we may use it with AttributeSafeStringInterface but every class implementing this interface should return a string.
Why not just create a SafeString object? I don't see the value of a mock here.
Comment #101
stefan.r commentedMaking some further changes
Comment #103
stefan.r commentedComment #106
pwolanin commentedI'm going to try to fix the test fails now.
Comment #107
almaudoh commentedshould be
SimplePlainTextOutputComment #108
pwolanin commentedmis-named classes used in the code.
Comment #110
stefan.r commentedThe PlainTextSimple / PlainTextFormatted was deliberate as SimplePlainText seemed more confusing but I don't care much about one or the other. If we're going to go back to SimplePlainText let's rather mention FormattedPlainText in the @todo as well.
I think we'll have to revert the SafeString change as SafeString is in Core\Render - which we're not supposed to refer to in Component?
Comment #111
pwolanin commentedSilly me - the DrupalComponentTest fails if you do that. Back to using mocks, plus fix another class name use.
Comment #112
dawehnerI'm still convinced by the comment about unsanizited. If you pass in something like t('Foo-
Do we need to come up with the email given that we already have
\Drupal\Core\Mail\MailFormatHelper::htmlToTextComment #113
stefan.r commented@dawehner yes, 1 is confusing, let me see about rewriting that.
As to 2, earlier in the issue a PlainTextFormattedOutput option came up (to be added in a followup). As far as I can see this would be no different from MailFormatHelper, so we could just move the logic from MailFormatHelper to PlainTextFormattedOutput as the formatted plain text could be used in more contexts than just email.
Comment #114
dawehnerDo you think this is needed as part of this issue?
Comment #119
stefan.r commentedI think this should be a followup. Created #2573009: Provide a PlainTextFormattedOutput output strategy.
Comment #120
stefan.r commentedComment #121
star-szrSome thoughts for now:
Didn't we say select lists are special from other attributes?
I'd also say there are use cases for including tags in a select list, for example if you are choosing different HTML tags for output of a field.
Minor: Blank line needed above docblock.
Minor: I think these
i.e.should both bee.g.,.What does long texts mean here?
Comment #122
imiksuFixed minors (#121.2 and #121.3).
Comment #123
almaudoh commentedRe #107 #108, #110: Sorry, didn't know there had been a decision to change the names. Patch looks good.
Comment #124
dawehnerGiven our previous discussion with @stefan.r I think we should add explicit UI test coverage for select form elements with quotes in there and HTML just to see what we exactly should do.
Comment #125
stefan.r commentedI'm actually not sure about options elements being special.
<option><</option>in a select does /not/ double escape for me.Comment #126
stefan.r commentedComment #127
stefan.r commentedDiscussed this patch with @catch earlier today and we didn't see the need of escaping HTML attributes - merely turning them into plain text is enough. I do think it makes sense to have a dedicated class for them though, even if they only wrap the plain text one.
If select elements really are special let's address them in a followup as I don't think they're blocking any criticals whereas this one is. Let's get this patch in?
Comment #130
catchSelect options in a followup is fine with me.
Comment #131
pwolanin commentedI don't understand the last change. The help text says "Use this when rendering a given HTML string into an HTML attribute value"
If people are putting it into an attribute value (or if we wire this up to a Twig filter) it needs to be escaped.
Comment #132
pwolanin commentedComment #133
pwolanin commentedComment #134
stefan.r commented@pwolanin I had discussed this with @catch and the idea was not to do any sanitization in these output strategies and arrange this with Twig instead. Which would need updated docs if that's what we want to do.
Alternatively maybe we could sanitize and then mark as safe (for output into attributes).
Comment #135
stefan.r commentedre #132 why would we need to escape when outputting into a plain text context? It'd be for output into (non-URL/style/on*) attributes in any case, right?
And wouldn't that attribute value get double escaped whenever Twig has a go at that string?
Comment #138
pwolanin commented@stefan.r - in the twig context I was assuming you'd use this as a filter, not let Twig do the default autoescaping.
Comment #139
stefan.r commented@pwolanin OK can you double check with @catch what we want to do here then?
Comment #140
plach@pwolanin
Regardless of what we decide, plain text should not be HTML-escaped. I think the change was applied to the wrong strategy class.
Comment #141
pwolanin commentedOh, huh - yes. I'm a little tired
Comment #142
catchIf we let Twig don't we have to add knowledge about the strategy to AttributeString since that's what we use mostly.
I don't think we should do that here. The Views case where t() is put into an attribute can do plain text then put that string into AttributeString which does the escaping.
Anything beyond that is major followup for me not release blocking. And we shouldn't add any strategies we can't use in core at all here either.
Comment #143
plachOk, removing the HTML attribute strategy altogether. We can add it back later if needed.
Comment #144
plachAlso improved docs a bit.
Comment #145
stefan.r commentedComment #146
lauriiiNit: into fits the previous line
Why was this @todo removed in #88?
Comment #147
plachOn this
Comment #148
plach@lauriii
Because we changed approach: we no longer automatically apply the output strategy from the
::render()method, instead we apply it to its return value.Comment #149
lauriiiThis looks good for me now :) Thanks @plach!
Comment #150
plachI think we should be done here.
Comment #151
plachWorking on a CR
Comment #152
almaudoh commentedRTBC++
Comment #153
plachCR at https://www.drupal.org/node/2574697
Discussed with @alexpott and we agreed
PlainTextOutputis a better name for the strategy we are adding here,PlainTextFormattedOutputis still a good name for the upcoming advanced strategy.Comment #154
plachAnd now with patch!
Comment #155
plachCreated #2574723: Figure out whether we need a dedicated output strategy for select elements.
Comment #159
jhedstromFails on bot are
PHP Fatal error: Uncaught exception 'ReflectionException' with message 'Class Drupal\Tests\Component\Utility\PlainTextOutputTest does not exist'due to a mismatch between class and filename.
Comment #160
jhedstromWorking on #159, and whatever else @alexpott finds during review.
Comment #161
jhedstromQuick rename so tests can run while review proceeds.
Comment #162
wim leersPlainTextFormattedOutputmakes no sense to me.How can something be both plain and formatted? Contrast with the text field types we have:
— you must choose, you can't have both.
Can somebody enlighten me?
Comment #163
alexpottI'm too am confused by this comment. I thought this patch would give as the tools necessary to do #2572597: Replace !placeholder with @placeholder in mail code. Tbh I don't get why we have to do this - surely
MailFormatHelper::htmlToText()just does what it does and this is great.Comment #164
wim leerss/plain-text/plain text/
s/santization/sanitization/
What is "simple" plain text? Isn't plain text always "simple"?
I'm betting this is related to the "formatted plain text", but that doesn't mean this is clear. I think it's fine to omit.
Discussed with @jhedstrom, and now #162 is answered. I still think the name doesn't make sense, but we can discuss that further/refine it in the follow-up issue. I will comment there in a minute.
s/array/string[]/
s/none/the empty array/
EDIT: to be clear, I fixed all my nits.
Comment #165
wim leersOk, per #163, we need to deal with the mail thing here.
I think
PlainTextFormattedOutputis an extremely confusing name. So let's try to find something better. This is the relevant code:So it is transforming HTML into plain text, while preserving the HTML structure and just mapping it to a syntax defined in RFC 3676. That RFC is titled .
So it's actually transforming
text/htmltotext/plain.Suggested names based on this so far:
StructuredPlainTextOutput,MailPlainTextOutput,PlainTextMailOutput.Second, let's look at that RFC for inspiration:
Suggested names based on this:
PreformattedPlainTextOutput,MonospacedPlainTextOutput.Conclusion: pick one of these names:
StructuredPlainTextOutputMailPlainTextOutputPlainTextMailOutputPreformattedPlainTextOutputMonospacedPlainTextOutputI think
PlainTextMailOutputis the best one.Comment #166
jhedstromDiscussed with @stefan.r and @alexpott, and decided the
@todocould be removed entirely, and that this is rtbc assuming it goes green.Comment #167
wim leersEh, ok. Can we document here why?
I've transplanted my commment at #165 to #2573009-3: Provide a PlainTextFormattedOutput output strategy.
Comment #168
jhedstromre #167 the reasoning is that the new class isn't strictly needed as part of this fix, so an @todo is premature, but the new class can still be discussed in #2573009: Provide a PlainTextFormattedOutput output strategy.
Comment #169
alexpottThanks everyone - this looks a great solution. Committed 70bad3e and pushed to 8.0.x. Thanks!
Comment #175
berdirToo slow old testbot, too slow.
Comment #176
plach:)
Comment #177
sunThis change attempts to introduce a custom concept of "output strategies", not a utility. Why was the code added to Utility?
Comment #178
plach@sun:
hey :)
My original patch was providing an
OutputFormatternamespace, @pwolanin asked for a simplification of the approach and since bothHtmlandSafeMarkuplive inUtility, that felt like a good place also forPlainTextOutput.Btw, I think it would useful if
SafeMarkupimplementedOutputStrategyInterface, providing a::renderFromHtml()method implementation just casting its input to string. That would allow to pass it to methods receiving any output strategy as input.