Problem/Motivation
https://api.drupal.org/api/drupal/includes!common.inc/function/drupal_at...
Attribute values are sanitized by running them through check_plain(). Attribute names are not automatically sanitized. When using user-supplied attribute names, it is strongly recommended to allow only white-listed names, since certain attributes carry security risks and can be abused.
Does not mention that URLs need to be run through filter_xss_bad_protocol() first.
https://api.drupal.org/api/drupal/core%21lib%21Drupal%21Core%21Template%...
Collects, sanitizes, and renders HTML attributes.
The attribute keys and values are automatically sanitized for output with Html::escape() and the entire attribute string is marked safe for output.
The only place that mentions this is:
https://api.drupal.org/api/drupal/core%21lib%21Drupal%21Component%21Util...
https://api.drupal.org/api/drupal/includes%21common.inc/function/drupal_...
Both functions give the impression that they successfully santize attribute values, when they don't if what is going to be sanitized is going to end up in an href, src etc.
Proposed resolution
Add documentation to Attribute in 8.x, then drupal_attributes() in 6.x/7.x that you need to strip dangerous protocols from URLs before passing them in.
Remaining tasks
User interface changes
API changes
Data model changes
Comment | File | Size | Author |
---|---|---|---|
#22 | 2567741_22.patch | 1.17 KB | Sivaji_Ganesh_Jojodae |
#15 | 2567741-15-attribute-protocol-filtering-doc.patch | 1.35 KB | mr.baileys |
#9 | 2567741-9-attribute-protocol-filtering-doc.patch | 1.39 KB | mr.baileys |
Comments
Comment #2
xjmComment #3
mr.baileysDrupalCon Extended Sprints.
Comment #4
stefan.r CreditAttribution: stefan.r commentedIn #2565895: Add a new :placeholder to SafeMarkup::format() for URLs that handles bad protocols we also mention on* and style attributes - could be worth repeating here
Comment #5
pwolanin CreditAttribution: pwolanin at Acquia commentedComment #6
mr.baileysdrupal_attributes() has been removed in D8, so the attached patch only adds documentation for the Attribute class.
Comment #7
stefan.r CreditAttribution: stefan.r commented1. s/an URI/a URI/
2. Should we also rewrite the immediately preceding paragraph in case someone reads that without reading the addition in this patch, I feel like it still hints at safeness a bit too much
Comment #8
mr.baileysThanks for the feedback, I merged both paragraphs into one (since they both concern security), and replaced "sanitize" with "escape" in the first part (since sanitization implies completely safe, which is not the case for URIs that require protocol filtering).
Comment #9
mr.baileysPrevious patch file was incomplete.
Comment #12
stefan.r CreditAttribution: stefan.r commentedLooks great :)
Comment #13
pwolanin CreditAttribution: pwolanin as a volunteer and at Acquia commented"the entire attribute string is marked safe for output" is not actually true. The object itself implements SafeStringInterface, so it will not be escaped by Twig.
Comment #14
David_Rothstein CreditAttribution: David_Rothstein as a volunteer commentedAs in the other issue, I'm still a little iffy on having the advice in the documentation contradict what Drupal actually does itself in Xss::attributes() (apply protocol-filtering to all attributes, except "title", "alt", and "data-*").
On the other hand, we know that's more conservative than it needs to be and in practice telling people to manually run UrlHelper::stripDangerousProtocols() on almost every attribute is overkill, so I guess it's OK but is there a way we can be more specific here about how you know when you really really need to run it? Like a reference somewhere to what we mean by "URI attribute"? (Not necessarily a blocker for this issue though - documenting it as above is already leaps and bounds better than not documenting it at all.)
Comment #15
mr.baileysSlightly reworded after talking to @pwolanin.
@David_Rothstein: I changed it to read "No protocol filtering is applied, so when using user-entered input as a value for an attribute that expects an URI (href, src, ...)," ... which might be clearer? I don't think it's feasable to give an exhaustive list here like the on at http://stackoverflow.com/questions/2725156/complete-list-of-html-tag-att...
Comment #16
stefan.r CreditAttribution: stefan.r commentedAs far as HTML5 is concerned, likely every attribute listed in http://www.w3.org/html/wg/drafts/html/master/index.html#attributes-1 that has the word URL in it. But I'd be a bit wary of explicitly mentioning them all here, we only really know that href works. In another issue it could be interesting to do the protocol filtering in Attribute itself, based on this list of attributes...
What I don't see is any attributes that might be a URL, or might not be a URL, it's usually pretty clear. So marking RTBC, David_Rothstein if you object feel free to set back to NR.
Comment #18
catchLooks great to me. Committed/pushed to 8.0.x, thanks!
Moving to 7.x for backport.
Comment #19
David_Rothstein CreditAttribution: David_Rothstein as a volunteer commentedThe final version looks good to me too (though it should say "a URI" rather than "an URI").
Maybe in the backport we should add similar documentation to format_string() here too, since the new placeholder added in #2565895: Add a new :placeholder to SafeMarkup::format() for URLs that handles bad protocols may not make sense to backport (it could be done, but we couldn't actually use it consistently in core without breaking lots of translations).
Comment #20
catchYes I"d definitely add to format_string() and possibly t() itself in 7.x
Comment #22
Sivaji_Ganesh_Jojodae CreditAttribution: Sivaji_Ganesh_Jojodae commentedPatch ported to D7 (not sure if I'm correct).
Comment #24
jhodgdonThis looks pretty good to me... Part of it is a bit confusing though.
The first sentence in this paragraph starts with "Attribute values...". So far, so good. Then the next one starts in about the attribute names. Then the third sentence, about "No protocol filtering" is about the values again. Then the fourth is about the names again.
Can we rearrange this so that the parts about the values are together, and the parts about the names are together? This would make much more sense. Thanks!