Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
If an html title attribute of a link contains a colon (as in "a: b") only the part ofter the colon (" b") ends up in the html code. This is incorrect behavior.
This problem was reported in the WYSIWYG module bugs (http://drupal.org/node/878926), but I see the same behavior without that module.
Example input:
<a href="#" title="foo: bar">Text</a>
Generated HTML:
<p><a href="#" title=" bar">Text</a></p>
Comments
Comment #1
patrickdepinguin CreditAttribution: patrickdepinguin commentedmarkus_petrux pointed out to me that this behavior is a 'feature' in the filter module, related to cross-site scripting (see http://api.drupal.org/api/function/filter_xss_bad_protocol/6). Everything to the left of a colon is considered a protocol, and should be in the whitelist of protocols.
In my opinion, this checking is not needed (and incorrect) for attributes such as the title attribute of a link tag.
By the way, it is unclear to me to which component this bug relates, i.e. how the filter_xss_bad_protocol function is called).
Comment #2
dddave CreditAttribution: dddave commentedI guess this is the right component.
Comment #3
sunComment #4
sunSorry, but it's too late and we have much more important issues to deal with right now. A fix for this may be backported from D8.
Comment #5
chris.guitarte CreditAttribution: chris.guitarte commentedAny word on the status of this? Just wondering how it sits in the queue. We're still seeing this behavior in 6.22 - but for the alt attribute.
Example Input:
<img alt="annen:ascj" src="/sites/default/files/AnnenbergA_75p.jpg" width="75" height="75" />
Example Output:
Example:
<img alt="ascj" src="/sites/default/files/AnnenbergA_75p.jpg" width="75" height="75" />
Thanks!
Chris
Comment #6
sunComment #7
arski CreditAttribution: arski commentedyeah, there should be a whitelist of attributes for which this check is unnecessary I suppose :x
Comment #8
Hanno CreditAttribution: Hanno commentedpossible related: #1328768: attributes 'xml:lang' and 'xml:id' transform to 'lang' and 'id' in filter_xss
Comment #9
alemadleiBased on the initial comment, I added a condition to check for attributes that include the colon as part of their name. After this, the stated test cases worked as expected.
This happens when namespaced attributes are being used.
Comment #10
OlafskiHello, will a fix be backported to D7? There are quite common use cases for a colon in the title attribute, e.g. for adding copyright information to images.
Comment #11
mike booth CreditAttribution: mike booth commentedUnless I am being clueless and misreading the patch at #9, it addresses the issue of attribute names with colons in them (which might help with e.g. #1328768: attributes 'xml:lang' and 'xml:id' transform to 'lang' and 'id' in filter_xss) but doesn't address attribute values with colons in them, which is the topic of this issue.
It does look as though Drupal's filter.module merrily calls filter_xss_bad_protocol() on every attribute value it ever sees, regardless of the attribute name or the tag type. The good thing about this is that it's maximally paranoid! The bad thing is that it zaps all colons from any attribute value, ever, unless the text before the colon just happens to match a whitelisted URI protocol.
One could fix this by whitelisting certain attribute names, like "title" and "alt" and... (any other suggestions?) Or one could do that more conservatively by whitelisting certain tag/attribute pairs - but that approach would appear to require more extensive refactoring in filter.module.
Comment #12
DevElCuy CreditAttribution: DevElCuy commentedIn order to reproduce the bug, make sure that your "text format" is using "Limit allowed HTML tags " filter.
Comment #13
DevElCuy CreditAttribution: DevElCuy commented#12 is not working any more, debugging.
Comment #14
Liam MorlandThis filtering should only be done on attributes that can actually be a URL. A whitelist of attributes that are plaintext is probaby a good way to do this.
Comment #15
iStryker CreditAttribution: iStryker commentedWhitelist from https://www.owasp.org/index.php/XSS_%28Cross_Site_Scripting%29_Preventio...
I didn't know value was safe though
Comment #16
iStryker CreditAttribution: iStryker commentedHere is a patch that (kinda) works. There is one problem it doesn't read the changes in the YAML file. I cannot figure out why. YAML files are still new to me. It works because alt and title are add to the array if the configuration information isn't found.
UPDATE: #1998466: Convert filter_xss_admin and similar function to an Xss component just got committed and broke this patch.
Comment #17
iStryker CreditAttribution: iStryker commentedOk I have fixed the patch to work with the converted Xss.php class file. It still doesn't load the config in _drupal_code_bootstrap() and I don't know why.
Can anyone explain why?
Comment #18
ParisLiakos CreditAttribution: ParisLiakos commentedno need for
\
before Drupal class..we only enforce those in namespaced codetrailing space, that should be removed prior to commit
$thisval variable should be more explanatory..maybe $filtered_variable..dunno
Comment #19
iStryker CreditAttribution: iStryker commentedOk config does load properly. I had to either replace core\modules\system\config\system.filter.yml in your sites/[mysite]/files/config_XXXXXXX directory or place core\modules\system\config\system.filter.yml into the staging directory and sync the configuration.
FYI this is a major issue as it breaks accessibility when there is a colon in the alt and title attribute
Comment #20
iStryker CreditAttribution: iStryker commentedPatch updated
Comment #21
iStryker CreditAttribution: iStryker commentedChanging to needs review
Comment #22
ParisLiakos CreditAttribution: ParisLiakos commentedtriggering testbot
Comment #23
ParisLiakos CreditAttribution: ParisLiakos commentedalso this is not major
Comment #25
iStryker CreditAttribution: iStryker commented#20: drupal-whitelist_attributes-952964-20.patch queued for re-testing.
Comment #26
mgifford#20: drupal-whitelist_attributes-952964-20.patch queued for re-testing.
Comment #28
Liam Morland#20: drupal-whitelist_attributes-952964-20.patch queued for re-testing.
Comment #29
mgifford#20: drupal-whitelist_attributes-952964-20.patch queued for re-testing.
Comment #32
mgiffordreroll
Comment #34
mgiffordComment #35
mgiffordno longer applies.
Comment #36
Rajendar Reddy CreditAttribution: Rajendar Reddy commentedUpdating patch with reroll. Please review.
Comment #38
mgiffordRan into:
PHP Fatal error: Class 'Drupal\Component\Utility\Url' not found in /var/lib/drupaltestbot/sites/default/files/checkout/core/lib/Drupal/Component/Utility/Xss.php on line 268
This ultimately led me to #2191873: Document the WYSIWYG XSS filtering concept and architecture for developers surprised at how complicated this patch is to just allow a ":" in a title/alt text.
Comment #39
smira CreditAttribution: smira commentedReroll now correctly uses the UrlHelper class again, not sure why it was changed to Url
Comment #40
smira CreditAttribution: smira commentedComment #41
Wim LeersThis implies
FilterHtml::getHTMLRestrictions()
must also be updated.This exact same problem also occurs for
data-*
attributes. I didn't know about this issue until today, so I was trying to fix this same problem without whitelisting attributes over at #2105841: Xss filter() mangles image captions and title/alt/data attributes. However, this approach seems quite a bit simpler, so … could you please also add support for wildcard attributes, so we could allowdata-*
, which would matchdata-align
,data-caption
,data-foo
…?Comment #42
eaton CreditAttribution: eaton commentedA very similar problem also appears in namespaced HTML tags: #2321061: Xss::split() fails on custom HTML elements with colons in the name
Comment #43
mgiffordComment #44
cs_shadow CreditAttribution: cs_shadow commentedThis works fine now since #2105841: Xss filter() mangles image captions and title/alt/data attributes got in so closing this one as duplicate.
Comment #45
Liam MorlandWill that fix be backported to D7?
Comment #46
cs_shadow CreditAttribution: cs_shadow commented@Liam Morland, if this is to be backported, this issue is not the right place. Since this issue didn't had a patch, there's nothing to backport. You'll be better off asking this question on the original issue: #2105841: Xss filter() mangles image captions and title/alt/data attributes. Since its a security fix it should be ported to D7 too if it also has a similar vulnerability but I'm not sure about that.