If an html title attribute of a link contains a colon (as in "a: b") only the part ofter the colon (" b") ends up in the html code. This is incorrect behavior.

This problem was reported in the WYSIWYG module bugs (http://drupal.org/node/878926), but I see the same behavior without that module.

Example input:
<a href="#" title="foo: bar">Text</a>
Generated HTML:
<p><a href="#" title=" bar">Text</a></p>

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

patrickdepinguin’s picture

markus_petrux pointed out to me that this behavior is a 'feature' in the filter module, related to cross-site scripting (see http://api.drupal.org/api/function/filter_xss_bad_protocol/6). Everything to the left of a colon is considered a protocol, and should be in the whitelist of protocols.

In my opinion, this checking is not needed (and incorrect) for attributes such as the title attribute of a link tag.

By the way, it is unclear to me to which component this bug relates, i.e. how the filter_xss_bad_protocol function is called).

dddave’s picture

Component: forms system » filter.module

I guess this is the right component.

sun’s picture

Title: Wrong behaviour if link title attribute contains colon (core) » String stripped from title attribute if contains a colon
Version: 6.19 » 7.x-dev
Issue tags: -filter_xss_bad_protocol, -title attribute
sun’s picture

Version: 7.x-dev » 8.x-dev
Priority: Normal » Minor

Sorry, but it's too late and we have much more important issues to deal with right now. A fix for this may be backported from D8.

chris.guitarte’s picture

Version: 8.x-dev » 6.22

Any word on the status of this? Just wondering how it sits in the queue. We're still seeing this behavior in 6.22 - but for the alt attribute.

Example Input:
<img alt="annen:ascj" src="/sites/default/files/AnnenbergA_75p.jpg" width="75" height="75" />

Example Output:
Example:
<img alt="ascj" src="/sites/default/files/AnnenbergA_75p.jpg" width="75" height="75" />

Thanks!
Chris

sun’s picture

Version: 6.22 » 8.x-dev
arski’s picture

yeah, there should be a whitelist of attributes for which this check is unnecessary I suppose :x

Hanno’s picture

alemadlei’s picture

Status: Active » Needs review
Issue tags: +dlatino, +drupalcr
FileSize
764 bytes

Based on the initial comment, I added a condition to check for attributes that include the colon as part of their name. After this, the stated test cases worked as expected.

This happens when namespaced attributes are being used.

Olafski’s picture

Hello, will a fix be backported to D7? There are quite common use cases for a colon in the title attribute, e.g. for adding copyright information to images.

mike booth’s picture

Status: Needs review » Needs work

Unless I am being clueless and misreading the patch at #9, it addresses the issue of attribute names with colons in them (which might help with e.g. #1328768: attributes 'xml:lang' and 'xml:id' transform to 'lang' and 'id' in filter_xss) but doesn't address attribute values with colons in them, which is the topic of this issue.

It does look as though Drupal's filter.module merrily calls filter_xss_bad_protocol() on every attribute value it ever sees, regardless of the attribute name or the tag type. The good thing about this is that it's maximally paranoid! The bad thing is that it zaps all colons from any attribute value, ever, unless the text before the colon just happens to match a whitelisted URI protocol.

One could fix this by whitelisting certain attribute names, like "title" and "alt" and... (any other suggestions?) Or one could do that more conservatively by whitelisting certain tag/attribute pairs - but that approach would appear to require more extensive refactoring in filter.module.

DevElCuy’s picture

In order to reproduce the bug, make sure that your "text format" is using "Limit allowed HTML tags " filter.

DevElCuy’s picture

#12 is not working any more, debugging.

Liam Morland’s picture

One could fix this by whitelisting certain attribute names, like "title" and "alt" and... (any other suggestions?) Or one could do that more conservatively by whitelisting certain tag/attribute pairs - but that approach would appear to require more extensive refactoring in filter.module.

This filtering should only be done on attributes that can actually be a URL. A whitelist of attributes that are plaintext is probaby a good way to do this.

iStryker’s picture

Whitelist from https://www.owasp.org/index.php/XSS_%28Cross_Site_Scripting%29_Preventio...

Safe HTML Attributes include: align, alink, alt, bgcolor, border, cellpadding, cellspacing, class, color, cols, colspan, coords, dir, face, height, hspace, ismap, lang, marginheight, marginwidth, multiple, nohref, noresize, noshade, nowrap, ref, rel, rev, rows, rowspan, scrolling, shape, span, summary, tabindex, title, usemap, valign, value, vlink, vspace, width

I didn't know value was safe though

iStryker’s picture

Here is a patch that (kinda) works. There is one problem it doesn't read the changes in the YAML file. I cannot figure out why. YAML files are still new to me. It works because alt and title are add to the array if the configuration information isn't found.

UPDATE: #1998466: Convert filter_xss_admin and similar function to an Xss component just got committed and broke this patch.

iStryker’s picture

Ok I have fixed the patch to work with the converted Xss.php class file. It still doesn't load the config in _drupal_code_bootstrap() and I don't know why.

Can anyone explain why?

ParisLiakos’s picture

+++ b/core/includes/common.incundefined
@@ -4349,6 +4349,15 @@ function _drupal_bootstrap_code() {
+  $allowed_attributes = \Drupal::config('system.filter')->get('attributes');

no need for \ before Drupal class..we only enforce those in namespaced code

+++ b/core/lib/Drupal/Component/Utility/Xss.phpundefined
@@ -22,6 +22,23 @@ class Xss {
+   * ¶

@@ -224,8 +242,14 @@ protected static function attributes($attributes) {
+            } ¶

trailing space, that should be removed prior to commit

+++ b/core/lib/Drupal/Component/Utility/Xss.phpundefined
@@ -224,8 +242,14 @@ protected static function attributes($attributes) {
+              $thisval = $match[1];

@@ -236,8 +260,12 @@ protected static function attributes($attributes) {
+              $thisval = UrlValidator::filterBadProtocol($match[1]);
...
+              $thisval = $match[1];

@@ -247,7 +275,12 @@ protected static function attributes($attributes) {
+              $thisval = UrlValidator::filterBadProtocol($match[1]);
...
+              $thisval = $match[1];

$thisval variable should be more explanatory..maybe $filtered_variable..dunno

iStryker’s picture

Title: String stripped from title attribute if contains a colon » String stripped from title and alt attribute if contains a colon
Priority: Minor » Major
Issue tags: +Accessibility

Ok config does load properly. I had to either replace core\modules\system\config\system.filter.yml in your sites/[mysite]/files/config_XXXXXXX directory or place core\modules\system\config\system.filter.yml into the staging directory and sync the configuration.

FYI this is a major issue as it breaks accessibility when there is a colon in the alt and title attribute

iStryker’s picture

Patch updated

iStryker’s picture

Status: Needs work » Needs review

Changing to needs review

ParisLiakos’s picture

triggering testbot

ParisLiakos’s picture

Priority: Major » Normal

also this is not major

Status: Needs review » Needs work
Issue tags: -Accessibility, -dlatino, -drupalcr

The last submitted patch, drupal-whitelist_attributes-952964-20.patch, failed testing.

iStryker’s picture

Status: Needs work » Needs review
mgifford’s picture

Status: Needs review » Needs work

The last submitted patch, drupal-whitelist_attributes-952964-20.patch, failed testing.

Liam Morland’s picture

Status: Needs work » Needs review
mgifford’s picture

Status: Needs review » Needs work

The last submitted patch, drupal-whitelist_attributes-952964-20.patch, failed testing.

mgifford’s picture

Issue summary: View changes
Status: Needs work » Needs review
FileSize
6.25 KB

reroll

Status: Needs review » Needs work

The last submitted patch, 32: drupal-whitelist_attributes-952964-32.patch, failed testing.

mgifford’s picture

Status: Needs work » Needs review
FileSize
8.15 KB
mgifford’s picture

Status: Needs review » Needs work
Issue tags: +Needs reroll

no longer applies.

Rajendar Reddy’s picture

Status: Needs work » Needs review
FileSize
8.21 KB

Updating patch with reroll. Please review.

Status: Needs review » Needs work

The last submitted patch, 36: drupal-whitelist_attributes-952964-36.patch, failed testing.

mgifford’s picture

Ran into:

PHP Fatal error: Class 'Drupal\Component\Utility\Url' not found in /var/lib/drupaltestbot/sites/default/files/checkout/core/lib/Drupal/Component/Utility/Xss.php on line 268

This ultimately led me to #2191873: Document the WYSIWYG XSS filtering concept and architecture for developers surprised at how complicated this patch is to just allow a ":" in a title/alt text.

smira’s picture

Reroll now correctly uses the UrlHelper class again, not sure why it was changed to Url

smira’s picture

Status: Needs work » Needs review
Issue tags: -Needs reroll
Wim Leers’s picture

Status: Needs review » Needs work

This implies FilterHtml::getHTMLRestrictions() must also be updated.

This exact same problem also occurs for data-* attributes. I didn't know about this issue until today, so I was trying to fix this same problem without whitelisting attributes over at #2105841: Xss filter() mangles image captions and title/alt/data attributes. However, this approach seems quite a bit simpler, so … could you please also add support for wildcard attributes, so we could allow data-*, which would match data-align, data-caption, data-foo …?

eaton’s picture

This exact same problem also occurs for data-* attributes

A very similar problem also appears in namespaced HTML tags: #2321061: Xss::split() fails on custom HTML elements with colons in the name

mgifford’s picture

cs_shadow’s picture

Status: Needs work » Closed (duplicate)

This works fine now since #2105841: Xss filter() mangles image captions and title/alt/data attributes got in so closing this one as duplicate.

Liam Morland’s picture

Will that fix be backported to D7?

cs_shadow’s picture

@Liam Morland, if this is to be backported, this issue is not the right place. Since this issue didn't had a patch, there's nothing to backport. You'll be better off asking this question on the original issue: #2105841: Xss filter() mangles image captions and title/alt/data attributes. Since its a security fix it should be ported to D7 too if it also has a similar vulnerability but I'm not sure about that.