Hi,

I've found that Twitter URLs (e.g. http://twitter.com/#!/listingships) don't validate, presumably because RFC 3986 defines "#" and "!" as reserved characters. I'm not certain that it really is an invalid URL, but I'm no expert on RFC 3986...

I've disabled the validation so it's not an immediate problem, but it might be helpful if this sort of common URL abuse could be allowed - perhaps through a strict/loose validation switch.


Comments

jcfiala’s picture

Hm. This seems specific enough that we could add an exception to let that particular case through, i.e. treat 'twitter.com/#!' as a valid part of the path.

I welcome patches, naturally.

js’s picture

Hi and thank you for your work and code.

I also have a problem with Twitter links. I am using D7 and don't see how to turn off validation in this version. Is there a way?

jcfiala’s picture

No, there isn't yet. Unfortunately, I didn't notice that the D7 upgrade patch I received earlier seems to have removed some features. I'm not sure what happened, really. This, and my severe lack of spare time, is why Link for D7 is still in an alpha state.

sreynen’s picture

Title: Looser URL validation » URL validation should match RFC 3986
Version: 6.x-2.9 » 7.x-1.x-dev
Category: feature » bug
Status: Active » Needs work
FileSize
1.19 KB

Deviations from the RFC seem like a bug. This patch isn't very thoroughly tested, but it solves the Twitter hashbang problem at least.

I found the various patterns in link_validate_url() very difficult to compare to the RFC, so I wrote new patterns that map directly to the RFC. I didn't include pct-encoded, which should definitely be included; I just didn't need it and didn't have time to add it. So this still needs work, but it's a start.
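To illustrate what "direct mapping to the RFC" can look like, here is a minimal sketch in Python. It is not the attached patch and not the module's PHP. It builds the fragment pattern from pieces named after the ABNF rules in RFC 3986 (including pct-encoded) and splits the URL with the regular expression given in Appendix B of the RFC. The names UNRESERVED, PCHAR, SPLIT and fragment_is_valid() are made up for this example.

```python
import re

# Building blocks named after the ABNF rules in RFC 3986.
UNRESERVED = r"[A-Za-z0-9\-._~]"
PCT_ENCODED = r"%[0-9A-Fa-f]{2}"
SUB_DELIMS = r"[!$&'()*+,;=]"
# pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
PCHAR = rf"(?:{UNRESERVED}|{PCT_ENCODED}|{SUB_DELIMS}|[:@])"
# fragment = *( pchar / "/" / "?" )
FRAGMENT = re.compile(rf"(?:{PCHAR}|[/?])*")

# Generic component splitter from RFC 3986, Appendix B.
# Group 9 is the fragment (the part after the first "#").
SPLIT = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")


def fragment_is_valid(url):
    """Hypothetical helper: check only the fragment part of a URL."""
    fragment = SPLIT.match(url).group(9) or ""
    return FRAGMENT.fullmatch(fragment) is not None
```

With these rules the "#" in the Twitter URL only marks the start of the fragment, and "!" is a sub-delim that the fragment rule allows, so fragment_is_valid("http://twitter.com/#!/listingships") passes. This sketch checks the fragment only; the other components would need their own patterns.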

jcfiala’s picture

Interesting. I started digging into RFC 3986 regex handling, and my general verdict now is that it's hard to do right.

Currently I'm working on melding some of what's being done in valid_url() with link_validate_url().

GreenReaper’s picture

For what it's worth, stuff like "//wikifur.com/" (a protocol-relative URI) should ideally work as well, because it facilitates sites which work on both HTTP and HTTPS. I hacked it in by enclosing the protocol selector and its trailing : in ( )? in link_validate_url().
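A rough sketch of that hack, in Python rather than the module's PHP and with a deliberately simplified pattern that is an assumption of this example, not the one in link_validate_url(). Wrapping the scheme and its trailing ":" in an optional group lets "//host/" through while still requiring the "//":

```python
import re

# Hypothetical, heavily simplified URL pattern for illustration only.
# The scheme and its trailing ":" sit inside an optional group,
# so a protocol-relative URL such as "//wikifur.com/" also matches.
URL = re.compile(r"^(?:(?:https?|ftp):)?//[A-Za-z0-9.-]+(?:/\S*)?$")


def accepts(url):
    """Return True if the simplified pattern accepts the URL."""
    return URL.match(url) is not None
```

Here accepts("//wikifur.com/") and accepts("http://wikifur.com/") both succeed, while a bare "wikifur.com/" is still rejected because the leading "//" stays mandatory.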

dqd’s picture

Status: Needs work » Closed (duplicate)
Issue tags: +field validation

Dear followers of this issue: please read the Link module's project page for how URL validation issues will be handled from here on. There is already a main issue to collect and discuss ALL possible validation scenarios in general. That's why I will mark this one as a duplicate. I need everyone to concentrate on that ONE discussion so we can move forward. After a D7 implementation we will provide a D6 backport.

Explanation: There are too many corner cases and URL validation feature requests from users to implement them all one after the other. We would end up with a settings form cluttered with 40 input fields and checkboxes for URL validation methods alone, randomly conflicting with each other. I think the better way is to find a new configuration method, maybe more complex but all-embracing, which lets the admin better decide how and when to validate the URL, with a good description that helps to set it up. This will surely lead to a new branch