The current list of allowed protocols needs to be updated to allow iTunes subscription urls to be embedded without url encoding.

I would also think we need a better way to admin this list beyond hard coding within the filter module. I imagine that this list will always need updating as new protocols are developed. It used to be easier to keep up with new protocols, but I imagine that this may become more difficult over time.

Doesn't it make better sense nowadays if this list was administered instead of coded? My two cents.

thanks,
Katrina
www.ambereyes.net

Comments

sun’s picture

Title: Add itpc to list of allowed protocols » URL filter: Use an internal variable for list of allowed protocols
Version: 5.7 » 7.x-dev
Category: feature » task
sun’s picture

Title: URL filter: Use an internal variable for list of allowed protocols » URL filter: Use filter_allowed_protocols variable
Status: Active » Needs review
StatusFileSize
new3.59 KB

Works and passes for me.

dries’s picture

+++ modules/filter/filter.module	4 Mar 2010 19:04:29 -0000
@@ -1111,8 +1111,19 @@ function _filter_url($text, $filter) {
+  foreach ($protocols as $key => $value) {
+    $protocols[$key] .= ':';
+    if ($value != 'mailto') {
+      $protocols[$key] .= '//';
+    }
+  }

I think we should drop this magic and add "://" to each entry in the $protocols array. Why? RFC 1738 (Uniform Resource Locators) describes a URL as follows: <scheme>:<scheme-specific-part>. The "//" is used to indicate that the scheme-specific-part (not the scheme itself!) complies with a common scheme syntax (i.e. username@password:domain:port). Having '//' is not a hard requirement. For example, a news URL takes one of two forms:

     news:<newsgroup-name>
     news:<message-id>

The nntp:// URL scheme is an alternative method of referencing news articles but news: is valid too. In other words, the current code makes assumptions that are not part of the specification. That is why I recommend removing the magic to automatically add "://".

PS: Skype uses 'callto://' which might be worth adding?

sun’s picture

+++ modules/filter/filter.module	4 Mar 2010 19:04:29 -0000
@@ -1111,8 +1111,19 @@ function _filter_url($text, $filter) {
+  // @see filter_xss_bad_protocol()
+  $protocols = variable_get('filter_allowed_protocols', array('ftp', 'http', 'https', 'irc', 'mailto', 'news', 'nntp', 'rtsp', 'sftp', 'ssh', 'telnet', 'webcal'));

Hm. Problem with that is that the filter_allowed_protocols variable is effectively *re-used* here. The variable already exists and is used like since always in filter_xss_bad_protocol().

And filter_xss_bad_protocol() would become very complex if the protocols would already contain those suffixes/separators.

I'd therefore recommend to just enhance the PCRE to always optionally match (?://) after $protocols...

+++ modules/filter/filter.module	4 Mar 2010 19:04:29 -0000
@@ -1111,8 +1111,19 @@ function _filter_url($text, $filter) {
+  $text = preg_replace_callback("`(<p>|<li>|<br\s*/?>|[ \n\r\t\(])(({$protocols})([a-zA-Z0-9@:%_+*~#?&=.,/;-]*[a-zA-Z0-9@:%_+*~#&=/;-]))([.,?!]*?)(?=(</p>|</li>|<br\s*/?>|[ \n\r\t\)]))`i", '_filter_url_parse_full_links', $text);

d'oh. Looking more closely at the already matches characters after $protocols, '//' is already matched.

Powered by Dreditor.

sun’s picture

StatusFileSize
new3.52 KB

Works!

Whether skype: or any other protocol belongs into this default list is a separate issue. At first sight, my gut feeling was: If we add skype:, then what about xmpp: or jabber: or whatever else? Also, Skype is proprietary... and, heh, searching for "skype" in contrib quickly revealed: http://drupal.org/project/filter_protocols by Dave Reid ;) We may want to consider to move that functionality into core.

sun’s picture

This looks pretty RTBC to me, and I'd like to move on to other input filter processing bugs in the queue, from which most need to change the very same lines of code.

As mentioned before, I'd really like to see a separate issue to quickly move http://drupal.org/project/filter_protocols into core.

Wait.

Would it make sense to configure the list of protocols per text format?

sun’s picture

Would it make sense to configure the list of protocols per text format?

No, let's defer that to D8.

RTBC, anyone?

corey.aufang’s picture

Coming in from #753278: URL Filter converts unsupported protocols and this looks good.

Garrett Albright’s picture

The project description page for my Drupher module has a similar situation to what corey.aufang is describing in #753278: URL Filter converts unsupported protocols. I typed in a Gopher URL and the filter is unhelpfully chopping off the "gopher:" from the front when it automatically linkifies it. Not linkifying it at all would have been better behavior.

retester2010’s picture

#5: drupal.filter-url-protocols.5.patch queued for re-testing.

Status: Needs review » Needs work

The last submitted patch, drupal.filter-url-protocols.5.patch, failed testing.

sun’s picture

Status: Needs work » Postponed
sun’s picture

Status: Postponed » Fixed

This had to be incorporated into #161217: URL filter breaks generated href tags to make the new URL filter tests pass.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.