Problem/Motivation

It is pretty common to want to use a token for a URL when editing text. For instance, you might want something like:

<a href="[my:token:here]">link text here</a>

However, if you try to do this in a body field that is using CKEditor, with a text format that uses the "Limit allowed HTML tags and correct faulty HTML" filter, if you enter a token like that in the field, and then later edit the content that contains that field and either look at the HTML source or save the content, the token is corrupted. This seems to be independent of which CKEditor buttons you have configured, so it isn't a problem (apparently) with a particular button plugin.

Additionally, some (but not all) other attributes in HTML get corrupted in the same way, if your text format allows these attributes. Some examples that were tested:

Tag Attribute Corrupted?
p class Corrupted
img src Corrupted
img alt Not corrupted
a href Corrupted

Tokens in the text that are outside of HTML attributes do not get corrupted.

Steps to reproduce:
a) In a content item with a field that has a text format that is configured to use CKEditor for editing, and which contains the "Limit allowed HTML tags and correct faulty HTML" filter, and allows the A tag, type some link text in the editor.
b) Highlight to select the link text.
c) Click the Link button (chains) in the editor toolbar, and enter [my:token:here] as the URL in the popup.
d) Click Save in the popup. Verify that the HTML source looks like

<a href="[my:token:here]">link text here</a>

e) Save the content item you are editing.
f) Test -- the link works fine (assuming you are running a token replace so it gets replaced by the right URL).
g) Edit the content item again.
h) When you get back to the editor, look at the HTML source. Instead of seeing what was there before, you will see something like this:

<a href="en:here]">link text here</a>

So that's the bug: if you re-edit some HTML text using CKEditor and the "Limit allowed HTML tags and correct faulty HTML" filter, and there is an A tag with a token in the URL (or tokens in various other attributes, but not all attributes), CKEditor truncates and mangles the token, leading to data loss. According to Priority Levels of Issues, this means it is a Critical bug (or at least Major?) because it leads to data loss.

Note: We are specifically seeing this in the proposed Help Topics module (see related issue #2943974: Work-around for route tokens in Help text format getting truncated after editing a help topic).

Proposed resolution

Fix Drupal so that CKEditor doesn't mangle tokens in URLs in A tags.

The problem was traced down to a bug in \Drupal\Component\Utility\Xss::attributes(). If you pass in an attribute string like href="[something:something:config_basic]" to this function, you get out something that looks like href="config_basic]".

Remaining tasks

Fix the bug in Xss::attributes().

User interface changes

CKEditor will not mangle HTML containing tokens for URLs in A tags, or other HTML tag attributes.

API changes

None.

Data model changes

None.

Issue fork drupal-2944173

Command icon Show commands

Start within a Git clone of the project using the version control instructions.

Or, if you do not have SSH keys set up on git.drupalcode.org:

Comments

jhodgdon created an issue. See original summary.

jhodgdon’s picture

Oh, you can see our CKEditor configuration here
https://cgit.drupalcode.org/sandbox-jhodgdon-2369943/tree/config/optiona...

It seems we are using the DrupalLink and DrupalUnLink plugins, so most likely the problem is either there or in the core CKEditor itself?

jhodgdon’s picture

Priority: Normal » Critical
Issue summary: View changes

Actually, this bug causes data loss/corruption with no warning to the user and no work-around. Merely having the CKEditor editor enabled on a field that contains a URL token, and updating a content item that contains that field, will cause that token to be truncated.

That means this is a critical bug, according to https://www.drupal.org/node/45111

So... updating priority and updating issue summary to add template.

andypost’s picture

Looks that's exactly ckeditor bug, probably this regexp in action https://github.com/ckeditor/ckeditor-dev/blob/major/core/htmldataprocess...

webchick’s picture

Is there an upstream bug report?

jhodgdon’s picture

Not yet. Have we definitely confirmed that it is a bug in ckeditor code, independent of Drupal?

Another way to look at this: Is using strings similar to Drupal tokens in URLs something that ckeditor should support and/or not mangle, or should Drupal's link plugin possibly override the mangling that ckeditor is doing? I mean, ckeditor mangles (or, to use more neutral language: edits, fixes, and otherwise modifies) various things in HTML code that it receives. Maybe it has a reason for not leaving href attributes of A links alone if they look like Drupal tokens?

jhodgdon’s picture

I'll just also note that as I mentioned in comment #2, Drupal's CKEditor module provides Link and Unlink plugins, so we are not using the raw ckeditor project code in this case. And Drupal has various other modifications and additions to the base ckeditor project code... So without debugging, it would not necessarily be clear that this bug comes from the ckeditor project and not the Drupal CKEditor module.

That's in addition to the question of whether, even if it does come from something in the ckeditor project code, whether it is a bug for purposes of the ckeditor project, or whether it's just a bug in the context of Drupal, which might want to put strings that look like tokens into URLs.

webchick’s picture

Oh, I might've misread #4 as attributing it to a bug in the upstream regex.

jhodgdon’s picture

He said "probably" it was this regexp in action in #4.

I'm not sure I understand how that regexp would mangle token-like strings, anyway? Or again, whether that's a bug or a feature from core ckeditor perspective?

jhodgdon’s picture

Issue summary: View changes

I tested this some more today, and have some further information.

First off, the text format / editor config I am using has the Drupal "Limit allowed HTML tags and correct faulty HTML" filter on it. So, in the settings for that filter, I enabled more attributes on various allowed tags, used the Source button to put attributes on those tags in formats that looked like tokens, and found that CKEditor (and/or whatever Drupal is doing) is mangling some attributes but not all. Here is a list of ones I tested:
- p with "class" attribute ==> mangled
- img src ==> mangled
- img alt ==> not mangled
- a href ===> mangled

So, I turned off the "Limit allowed and correct faulty HTML" filter in my text format config, so there are no filters but I'm using CKEditor. Now, my attributes are not mangled. So... probably this is not a core ckeditor project bug.

So ... Let's see, what happens if I leave that filter in but turn off CKEditor? Answer: no mangling of attributes in the textarea.

I also tried removing various buttons from my CKEditor config. Actually, I removed all of them except the Source button. My attributes were still mangled.

So. It looks like the problem is limited to happening when I am using a text format and editor config that:
- Includes the Drupal "Limit allowed HTML tags and correct faulty HTML" filter
- Uses CKEditor
- It doesn't seem to matter which buttons I have enabled.

Updating the summary... Anyway, I think the problem is in what Drupa's CKEditor module does with that "Limit allowed HTML tags and correct faulty HTML" text filter, not with CKEditor core code itself...

jhodgdon’s picture

Component: ckeditor.module » base system
Issue summary: View changes

OK, I have narrowed down the problem: The mangling is happening in function editor_filter_xss() in the editor.module file.

Right at the end, where it does:

  $editor_xss_filter_class = '\Drupal\editor\EditorXssFilter\Standard';
  \Drupal::moduleHandler()->alter('editor_xss_filter', $editor_xss_filter_class, $format, $original_format);

  return call_user_func($editor_xss_filter_class . '::filterXss', $html, $format, $original_format);

Just before that call, the $html value is fine, but the return value is mangled.

To make a long debugging story shorter... The problem is actually in the function
Drupal\Component\Utility\Xss::attributes()

If you pass in something like href="[something:something:config_basic]" to this function, you get out something that looks like href="config_basic]"

So, I am moving this to component base system, because the problem is in the XSS utility class. Ooops.

jhodgdon’s picture

Title: CKEditor mangles tokens in URLs » CKEditor mangles tokens in URLs, due to bug in Xss::attributes()
les lim’s picture

I think this is the same bug as #2544110: XSS attribute filtering is inconsistent and strips valid attributes - the attributes method assumes ":" characters are an attempt to use a dangerous schema such as "javascript:", and sanitizes the string by stripping everything before the colon.

jhodgdon’s picture

Status: Active » Needs review
StatusFileSize
new663 bytes
new645 bytes

Here is a patch to the XssTest that illustrates the problem. This is just adding a data set to be tested, with elements: input, expected output, message for the test assertion, list of tags that are allowed.

If I change the test to look like this:

      // Token-like attributes.
      [
        '<img src="[something:like:a:token]">',
        '<img src="token]">',
        'Token-like attributes, no change',
        ['img'],
      ],

then it passes -- in other words, the src="[something:like:a:token]" attribute is mangled to look like src="token]". Here, I'll attach that test too so you can see that it passes. :)

Now, we only need to fix the actual bug, and make the (valid) test pass.

jhodgdon’s picture

This is possibly the same bug as this related issue (thanks @les_lim for pointing out)... I will add a comment there.

jhodgdon’s picture

I just tested the patch from the other issue and it doesn't fix the token problem from this issue, so I think they are related but different issues.

jhodgdon’s picture

And by the way, the test should probably be expanded. That attributes() function on the Xss class has all kinds of special cases for various attributes, so we should probably test more than just the one that I put in the test.

The last submitted patch, 14: 2944173-test-only-FAIL.patch, failed testing. View results

les lim’s picture

Is it possible to expand tokens *before* the attribute sanitization takes place, I wonder?

jhodgdon’s picture

Not in the case described in this issue. The token mangling is happening in the HTML that is in a textarea field on a content editing screen, which is being prepared for editing with CKEditor. You need to put the token into the HTML you are editing, and you don't want the token replace to happen until the content is being rendered.

jhodgdon’s picture

The problem is that the Xss class thinks : is a dangerous character, and the token system thinks : is a good separator.

amber himes matz’s picture

Might this API page for function locale_string_is_safe provide any insight or help for this issue? It sounds very similar to the problem described here.

Note the comment header for the function locale_string_is_safe:

// Some strings have tokens in them. For tokens in the first part of href or
  // src HTML attributes, \Drupal\Component\Utility\Xss::filter() removes part
  // of the token, the part before the first colon.
  // \Drupal\Component\Utility\Xss::filter() assumes it could be an attempt to
  // inject javascript. When \Drupal\Component\Utility\Xss::filter() removes
  // part of tokens, it causes the string to not be translatable when it should
  // be translatable.
  // @see \Drupal\Tests\locale\Kernel\LocaleStringIsSafeTest::testLocaleStringIsSafe()
  //
  // We can recognize tokens since they are wrapped with brackets and are only
  // composed of alphanumeric characters, colon, underscore, and dashes. We can
  // be sure these strings are safe to strip out before the string is checked in
  // \Drupal\Component\Utility\Xss::filter() because no dangerous javascript
  // will match that pattern.
  //
  // Strings with tokens should not be assumed to be dangerous because even if
  // we evaluate them to be safe here, later replacing the token inside the
  // string will automatically mark it as unsafe as it is not the same string
  // anymore.
  //
  // @todo Do not strip out the token. Fix
  //   \Drupal\Component\Utility\Xss::filter() to not incorrectly alter the
  //   string. https://www.drupal.org/node/2372127
jhodgdon’s picture

Yes, that's very interesting! At the end of that comment, it references this issue:
(e) #2372127: Prevent Xss::filter from stripping a token at the beginning of an html attribute

That was closed by @catch as a duplicate of
(a) #2635632: Drupal XSS filtering cuts valid attributes, make it configurable / use a blacklist

And that issue also has several open related issues on it:
(b) #2105841: Xss filter() mangles image captions and title/alt/data attributes
(c) #2552837: XSS::filter and filter_xss can create malformed attributes when you would expect them to be stripped
(d) #2544110: XSS attribute filtering is inconsistent and strips valid attributes

[letters added to each issue for reference purposes, sorry about (e) being at the top, I didn't think I needed it until later in my comment]

So let's see, maybe our issue is a duplicate of one or more of those?

OK, (d) is the one we've been discussing that was brought up by @Les Lim. It's definitely related, but not an exact match, and its patch doesn't fix this issue. It looks to me as though (d) and (b) are quite probably duplicates of each other.

(c) looks like a completely different issue, but related because it's also about attributes.

(a) doesn't look like a duplicate of this issue either, except that (e) definitely was, and @catch marked (e) as a duplicate of (a).

So... I'll add a comment to (b) and (d) about them probably being duplicates of each other, and a comment to (a) about whether it's going to fix this issue or not. And add some more related issues here.

jhodgdon’s picture

See also #2635632-20: Drupal XSS filtering cuts valid attributes, make it configurable / use a blacklist, where I have tried to summarize all the mangled attribute issues and ask whether we should solve them together or apart.

catch’s picture

Based on the last time I looked at this, my preference would be to try to come up with an extensible (or at least configurable) solution in #2635632: Drupal XSS filtering cuts valid attributes, make it configurable / use a blacklist and mark all the other issues as duplicate - with maybe a follow-up to improve the default list that core uses.

jhodgdon’s picture

If that issue summary gets updated to say it's also about preserving tokens in any attribute, and that issue's priority is appropriate, that is fine with me.

Currently, that issue has no summary, so it is difficult to tell what the approach is or what "configurable/blacklist" would mean. Also, it is only Major, and I think this issue is Critical (see issue summary -- leads to data loss if you are using tokens in many HTML tag attributes, and CKEditor).

But sure, if that other issue will resolve this problem (we could take the test patch here and add it there, probably with some more examples in other attributes?), then I'm all for combining.

jhodgdon’s picture

I took the one test here, expanded it a bit, and made a test-only patch on #2635632-21: Drupal XSS filtering cuts valid attributes, make it configurable / use a blacklist (which I then expanded further in the next comment to incorporate tests from two other related issues).

If that test-only patch is adopted, and the issue summary there is updated, I'm definitely comfortable closing this as a duplicate... I think the approach of an attribute blacklist/whitelist, even if configurable, is not going to be the right approach though. Tokens can be anywhere, and hopefully they aren't an XSS vector. If they are, we should rethink how they are formatted rather than trying to enumerate all the places where it is safe to use them and where it isn't.

catch’s picture

If we're going to look at changing token formatting, that should stay its own issue I think. I'd like to have one issue for the Xss::attributes() bug itself though.

jhodgdon’s picture

Well, the question is whether the current token formatting using : is an XSS attack vector. If we're somehow going to fix Xss::attributes() so that it will let tokens through, no matter where they are, and it will still protect from XSS attacks, then we don't need to add an alternate way to format tokens. Right?

jhodgdon’s picture

Status: Needs review » Active

Version: 8.6.x-dev » 8.7.x-dev

Drupal 8.6.0-alpha1 will be released the week of July 16, 2018, which means new developments and disruptive changes should now be targeted against the 8.7.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

ayalon’s picture

Is there a solution for this problem? There is so much text an links but I have exactly this bug and I'm looking for a patch.

jhodgdon’s picture

There is not a viable solution yet.

Version: 8.7.x-dev » 8.8.x-dev

Drupal 8.7.0-alpha1 will be released the week of March 11, 2019, which means new developments and disruptive changes should now be targeted against the 8.8.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

bramdriesen’s picture

So any movement on those sub issues? =) bumped into the same issue today, possibly related to this one: #2903336: Allow tokens for url of the Link field

jhodgdon’s picture

Nothing has changed since the last update, as far as I know.

Version: 8.8.x-dev » 8.9.x-dev

Drupal 8.8.0-alpha1 will be released the week of October 14th, 2019, which means new developments and disruptive changes should now be targeted against the 8.9.x-dev branch. (Any changes to 8.9.x will also be committed to 9.0.x in preparation for Drupal 9’s release, but some changes like significant feature additions will be deferred to 9.1.x.). For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

tom.moffett’s picture

I ran into this same problem last night, and I did some debugging to come across the same problem you did. I guess my Google skills aren't as strong as I thought. I tried to see if there was an existing issue before going into the weeds of Drupal core debugging. What brought me here is that I thought I had a workaround, but it isn't satisfactory.

Same thing is happening to me when trying to use a token for the class attribute value on an a tag when I use a text editor type that has "Limit allowed HTML tags and correct faulty HTML" filter enabled (for example, <a class="[somegroup:sometoken]" ...></a>). I am using the https://www.drupal.org/project/token_filter module. I was able to get it to at least render in the site by putting the token replacement first in the text editor's "Enabled filters", but when I load a page for editing, the token gets mangled when the content is loaded in CKEditor. Doing a view-source in the browser on the edit page shows that the data-editor-value-original attribute on the editor textarea is correct with the proper token, but the actual content of the textarea isn't. So, it does, indeed, look like this is getting mangled before getting into CKEditor on the client-side JS. I debugged the rendering on the site part, but I haven't debugged this part. My educated guess is that the XSS utility is still mangling the content that ends up getting rendered in the CKEditor when trying to edit a page (and after re-reading the complete issue here, that definitely was mentioned and is part of this issue).

tom.moffett’s picture

StatusFileSize
new17.22 KB

For my specific case, I was able to come up with a complete workaround now. It involved creating a custom filter for the text editor, using the solution to this issue, https://www.drupal.org/node/2943974, as my inspiration. Basically, use | to separate the token instead of : since that doesn't get stripped by the XSS utility. The custom filter comes in so that you can transform the token to the correct format it needs to be in to be recognized for replacement.

I have my custom filter and then the token_filter module's filter as the first two items in the "Filter processing order". I haven't tested it, but this might still work if your processing order is different, but obviously, the custom filter would at least need to come some time before the token filter. However, if you don't want to process tokens until after the "Limit allowed HTML..." filter, then both the custom filter and the token filter should be after that one.
filter processing order workaround

Sorry, but I don't have time right now to share a clean version of the full feature, but hoping that the idea could point people in the right direction. To help you along, here is the process function from my custom filter:

public function process($text, $langcode) {
    //TJM: Replace tokens in our editor-safe format (just using '|' instead of ':') with their normal token format.
    $processedText = preg_replace_callback(
      '/\[(?:[\w.\-]+\|)+[\w.\-]+\]/',
      function ($matches) {
        return str_replace('|', ':', $matches[0]);
      },
      $text
    );
    
    if($processedText === NULL) {
      //TJM: Default to being non-destructive if the replacement call somehow returns an error.
      $processedText = $text;
    }

    return new FilterProcessResult($processedText);
  }

I also used the filter class from token_filter module as a guide to set up my custom filter since I was new to writing custom filters prior to this, but it was pretty easy. For reference, https://git.drupalcode.org/project/token_filter/blob/8.x-1.x/src/Plugin/.... You only really need the "process" function for your simple custom filter.

It is kind of annoying to have to do this and have this structure for tokens all over your content wherever you use them (could make maintenance annoying later on), but this issue hasn't been fixed in 2 years and I really wanted to be able to use tokens right now.

Using this approach allows the content to render properly on the site, but it also keeps the CKEditor content correct when you go to edit a page. In my previous comment, I explained how I was able to get the first part to work, but this custom filter completes the workaround and allows happy editing and usage of tokens!

firewaller’s picture

+1

dmitry.korhov’s picture

In addition to #39 a little more completed workaround example:
Place FilterTokenAttributes.php into my_module/src/Plugin/Filter folder with content:

<?php

namespace Drupal\my_module\Plugin\Filter;

use Drupal\filter\FilterProcessResult;
use Drupal\filter\Plugin\FilterBase;

/**
 * A filter that converts | in tokens into :.
 *
 * @Filter(
 *   id = "filter_token_links",
 *   title = @Translation("Token Attributes Filter"),
 *   description = @Translation("Use tokens in html attributes e.g. [site|name] instead of [site:name]."),
 *   type = Drupal\filter\Plugin\FilterInterface::TYPE_TRANSFORM_IRREVERSIBLE,
 * )
 */
class FilterTokenAttributes extends FilterBase {

  /**
   * {@inheritdoc}
   */
  public function process($text, $langcode) {
    $processedText = preg_replace_callback(
      '/\[(?:[\w.\-]+\|)+[\w.\-]+\]/',
      function ($matches) {
        return str_replace('|', ':', $matches[0]);
      },
      $text
    );
    return new FilterProcessResult($processedText ?? $text);
  }

}

Version: 8.9.x-dev » 9.1.x-dev

Drupal 8.9.0-beta1 was released on March 20, 2020. 8.9.x is the final, long-term support (LTS) minor release of Drupal 8, which means new developments and disruptive changes should now be targeted against the 9.1.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

fabianx’s picture

Ohhh - that is a very creative solution! :)

You can even use the same token format and just add two filters: One before and one after the filtered HTML filter:

- Convert tokens to use ! instead of :
- Filter HTML (like normal)
- Convert tokens to use : instead of !

That is an effective core work-around and as long as the token regexp is safe, also a safe approach.

This can then be put into a contrib module: token_xss

This can _also_ help with the colons in class problem.

tido’s picture

StatusFileSize
new808 bytes

We managed to correct this problem by testing directly the presence of brackets at the beginning and at the end of the chain to be tested in the Xss::attributes() function in order to bypass the call to UrlHelper::filterBadProtocol().

Tokens are now correctly rendered and are no longer mangled when editing the node.

tido’s picture

StatusFileSize
new801 bytes

Here is the fixed patch (tested with Drupal 8.9.x).

rolfmeijer’s picture

@tido The patch applies on 8.9.7 and basically it works. But there are issues when a token in a href is extended. For instance if you have a custom user orders-page at /user/[uid]/orders than href="[current-user:url:path]/orders" won’t work. It will strip of anything until the last colon, i.e. href="path]/orders". I’m not sure if this is a proper use case and has to be considered.

Version: 9.1.x-dev » 9.2.x-dev

Drupal 9.1.0-alpha1 will be released the week of October 19, 2020, which means new developments and disruptive changes should now be targeted for the 9.2.x-dev branch. For more information see the Drupal 9 minor version schedule and the Allowed changes during the Drupal 9 release cycle.

aerzas’s picture

StatusFileSize
new800 bytes

Rerolling patch against Drupal 9.2.x.

aerzas’s picture

StatusFileSize
new821 bytes

Avoid notice on empty string.

markusd1984’s picture

StatusFileSize
new37.76 KB

@tom.moffett #39 and @dmitry.korhov #41, would that be possible via the custom filter module?

It has a field for the Pattern & Replacement text https://www.drupal.org/node/210553

Couldn't figure out if i just need to put into Pattern

/\[(?:[\w.\-]+\|)+[\w.\-]+\]/

and inside Replacement text "|" or would I need to utilise PHP Code perhaps to adopt/paste in yours?

custom filter safe token

Thanks for any help you can give, great solution I'd like to try in drupal 7.

scottcollier’s picture

StatusFileSize
new909 bytes

Patch against Drupal 8.9.x.

I think we only care that it starts with a token since we want to allow it to be a valid protocol.

So, all of the following should work:

<a href="[my:token:here]">link text here</a>
<a href="[my:token:here]/">link text here</a>
<a href="[my:token:here]/[user:id]/profile">link text here</a>
<a href="[my:token:here]/blog/[user:id]">link text here</a>

Version: 9.2.x-dev » 9.3.x-dev

Drupal 9.2.0-alpha1 will be released the week of May 3, 2021, which means new developments and disruptive changes should now be targeted for the 9.3.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

jhodgdon’s picture

This should also be considered: <a href="/foo/bar/[my:token:here]">link text here</a>

firewaller’s picture

StatusFileSize
new947 bytes

Re-rolled #51 against Drupal 9.2.x

kerasai’s picture

StatusFileSize
new743 bytes
new967 bytes

#51 requires the token to be the only thing present in the first "part" (split by /) of the URL. This doesn't work for my case, a URL such as <a href="[site:url]user/[commerce_subscription:uid:entity:uid]/orders">billing</a>.

Here is an adjustment to allow any URL that begins with a token to validate, patched against 8.9.x and I expect it to apply to 9.x branches.

Note, looking at the regex and the logic, this should allow the case noted in #53.

daggerhart’s picture

Status: Active » Needs review

#55 works for me. The token is maintained through multiple edits and loading of the content in the editor.

Regarding #53, My url has multiple tokens in it, and they all work as expected.

According to the issue description, fixing this replacement is the only task. I believe this is RTBC, but since I haven't seen anyone change the status of this ticket yet I'm only moving to Needs Review.

The last submitted patch, 48: 2944173-xss-attributes-48.patch, failed testing. View results

danflanagan8’s picture

Status: Needs review » Needs work
Issue tags: +Needs tests

Definitely can't RTBC this one without adding test coverage. Looks like #14 has a good start to the tests. There are some other cases that have been brought up in the comments since then though.

liquidcms’s picture

None of these patches apply to 9.1.11.

my url part of a link in editor is simply a single url token.. and on save it is replaced with "url]".

i was sure on an older project which was using 8.8.8, this worked correctly.

liquidcms’s picture

StatusFileSize
new984 bytes

hmm, i think #55 would work but i think the patch has a typo, the last line is this:

$thisval = $skip_protocol_filtering ? $match[1] : UrlHelper::filterBadProtocol($match[1]);

but pretty sure $thisval is wrong, and should just be $value. Once i change it to that, the token is not stripped and things work as expected.

corrected patch attached.

liquidcms’s picture

Ah, I see, they changed the variable name.. which is good, because why would anyone ever name a variable $thisval? lol.

danflanagan8’s picture

Status: Needs work » Needs review
Issue tags: -Needs tests
StatusFileSize
new2.29 KB
new3.35 KB

Here's a fail test in the spirit of #14, but with many more cases gathered from other comments in the issue.

The (partial) fix is identical to the fix from #55 except that it can handle tokens in src in addition to href. I call it a partial fix because one of the test cases still fails.

NOTE: The case that is still failing is <img src="foo[my:token:here]/[another:token]">, which is getting changed to <img src="here]/[another:token]">. The output of the test is stripped of various elements. If you run the test locally you can see the true output.

Status: Needs review » Needs work

The last submitted patch, 63: 2944173-63.patch, failed testing. View results

danflanagan8’s picture

I'm starting to wonder if there's any way to allow tokens in src or href securely.

Before that though, note there's a typo in my patch in #63.

+++ b/core/lib/Drupal/Component/Utility/Xss.php
@@ -266,6 +266,18 @@ protected static function attributes($attributes) {
+            $attributes_allowing_tokens = ['src', 'html'];

This should be href, not html.

Back to security. Certainly the current approach (#55 or #63 with typo fixed) is concerning. If I enable the contrib token_filter module and add it to the standard Basic HTML format with the option Replace empty values selected, I can very easily do some XSS.

I just enter this into the CKEditor:

<a href="[fake:token]javascript:alert('gotcha!');">XSS Link</a>

That comes out the other side as

<a href="javascript:alert('gotcha!');">XSS Link</a>

And it could never be as simple as not allowing the string javascript: because I could do this:

<a href="[fake:token]java[fake:token]script:alert('gotcha!');">XSS Link</a>

Any thoughts anyone?

Version: 9.3.x-dev » 9.4.x-dev

Drupal 9.3.0-rc1 was released on November 26, 2021, which means new developments and disruptive changes should now be targeted for the 9.4.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

stephencamilo’s picture

Status: Needs work » Closed (won't fix)
danflanagan8’s picture

Status: Closed (won't fix) » Needs work

Restoring previous status

amber himes matz’s picture

Status: Needs work » Needs review
StatusFileSize
new3.35 KB

Here's a new patch that just fixes the typo in the patch from #63.

We still need discussion on the questions raised in #65.

Status: Needs review » Needs work

The last submitted patch, 69: 2944173-typofix-69.patch, failed testing. View results

catch’s picture

This in turn is probably a duplicate of #2544110: XSS attribute filtering is inconsistent and strips valid attributes but I think we need to do a proper consolidation if so because there are competing approaches between the two issues.

amber himes matz’s picture

I dug into this issue and #2544110: XSS attribute filtering is inconsistent and strips valid attributes with the intention of verifying that they were duplicates. What I found was that they are both intent on modifying Drupal\Component\Utility\XSS's filtering in order to prevent XSS class from filtering out/mangling data from 2 different kinds of contexts:

  1. Tokens being used inside href or src attributes, e.g. <a href="[node:url:absolute]">link</a>
  2. in a body field that uses CKEditor and enables "Limit allowed HTML tags and correct faulty HTML" in the text format configuration.

  3. Valid attribute name/value combinations that provide RDF metadata, e.g. rel="schema:author" or property="foaf:name"

Right now, as per the scope described in each issue summary, they are not duplicates, but if the scope was widened to consider both use cases/contexts, they could be combined into 1 issue, ensuring a compatible solution. Considering that they're both dealing with XSS class, that might be desirable.

As it stands, the patches in each issue do not solve the other's problem:

  1. The patch in #2944173-69: CKEditor mangles tokens in URLs, due to bug in Xss::attributes() is saying, “hey XSS::attributes, allow tokens in this hardcoded safelist of attributes (src, href)“.
  2. The patch in #2544110-100: XSS attribute filtering is inconsistent and strips valid attributes is saying, “hey everyone in the XSS class, we’re gonna allow this safelist of attributes (alt, title) to bypass XSS::filter. And btw, protect these rdfaAttributes too.”

We do NOT want to allow href or src to bypass filtering altogether I would think! Which is what would happen if we used the attribute safelist approach in #2544110: XSS attribute filtering is inconsistent and strips valid attributes and just added in href and src. In this issue, we just want it to allow tokens and not mangle them up. Assuming that tokens aren’t an XSS vector (which is an open question for discussion in #65.

Could we combine these issues and their solutions? Maybe it would make sense to do so, to ensure a compatible solution that solves both problems (href, src token mangling, and RDFa Attribute handling in XSS class). We would need to increase the scope in the one that stays open. Is that worth doing? What do you think?

Final note, if this issue remained open, it would need an issue summary update, because the problem is still there (token mangling) but different in Drupal 9.4.x (the token is mangled on save, not after a re-edit only). This was mentioned in #59. But I am waiting to see what others think about combining the issues before doing so.

Version: 9.4.x-dev » 9.5.x-dev

Drupal 9.4.0-alpha1 was released on May 6, 2022, which means new developments and disruptive changes should now be targeted for the 9.5.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

amber himes matz’s picture

Looks like #2544110: XSS attribute filtering is inconsistent and strips valid attributes has been closed as a duplicate of a fixed issue. I've tested this issue again in 9.5.x and it's still a problem, although there are some minor differences in the results described in the issue summary.

I'm doing a bit more testing and plan to update the issue summary with some updated details and also some steps to test using Token Filter so that you can actually test with working tokens.

Version: 9.5.x-dev » 10.1.x-dev

Drupal 9.5.0-beta2 and Drupal 10.0.0-beta2 were released on September 29, 2022, which means new developments and disruptive changes should now be targeted for the 10.1.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 10.1.x-dev » 11.x-dev

Drupal core is moving towards using a “main” branch. As an interim step, a new 11.x branch has been opened, as Drupal.org infrastructure cannot currently fully support a branch named main. New developments and disruptive changes should now be targeted for the 11.x branch, which currently accepts only minor-version allowed changes. For more information, see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

jibran’s picture

RE #65: It is a very well-made point so thank you for that.

I feel like it shouldn't be \Drupal\Component\Utility\Xss::attributes() responsibility. If it can happen after token replacement using token_filter then I think it should be dealt with in token_filter. Once the token is replaced using token_filter then src and href should be passed to \Drupal\Component\Utility\UrlHelper::filterBadProtocol().

mandclu’s picture

I ran into this issue today, and had to implement a messy workaround. It would great if this could get more attention.

richardcapricorn’s picture

Hi all,

I used the 2 patch files provided by @danflanagan8. and turned it into a merge request.
I added the href property alongside the src and html property.

Any feedback is welcome

Best regards,
Richard Hoogstad

watergate changed the visibility of the branch 11.x to hidden.

drewcking’s picture

The patch of MR10819 works for me, thank you @richard_hoogstad.

acbramley made their first commit to this issue’s fork.

acbramley’s picture

Issue tags: +Bug Smash Initiative

Triaged as part of BSI.

Rebased the MR which was green but well behind HEAD.

Still NW based on an IS update requested in #76 as well as questioning the direction of the solution in #65

vitaliyb98 made their first commit to this issue’s fork.

Version: 11.x-dev » main

Drupal core is now using the main branch as the primary development branch. New developments and disruptive changes should now be targeted to the main branch.

Read more in the announcement.