Closed (works as designed)
Project: Drupal core
Version: 7.x-dev
Component: base system
Priority: Critical
Category: Bug report
Assigned: Unassigned
Issue tags:
Reporter:
Created: 3 Feb 2009 at 11:56 UTC
Updated: 20 May 2010 at 19:20 UTC
The code in http://api.drupal.org/api/function/valid_url/7 needs to support IDN (internationalized domain names). The line that needs to be fixed is:
(?:[a-z0-9\-\.]|%[0-9a-f]{2})+ # A domain name or a IPv4 address
I have no ready regex for this validation... maybe someone else?
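To make the problem concrete, here is a minimal sketch in Python that compiles only the domain fragment quoted above and runs it against an ASCII host and an IDN host. The helper name `is_ascii_host` is hypothetical, and the flags are an assumption: the fragment is treated as case-insensitive and ASCII-only.

```python
import re

# The domain fragment quoted from valid_url(), compiled on its own.
# Assumption: case-insensitive, ASCII-only matching (re.IGNORECASE | re.ASCII).
ASCII_HOST = re.compile(r"(?:[a-z0-9\-\.]|%[0-9a-f]{2})+", re.IGNORECASE | re.ASCII)


def is_ascii_host(host: str) -> bool:
    """Return True if the whole host matches the ASCII-only fragment."""
    return ASCII_HOST.fullmatch(host) is not None


if __name__ == "__main__":
    for host in ("example.com", "192.168.0.1", "例え.テスト"):
        print(host, is_ascii_host(host))
```

The character class only lists `a-z`, `0-9`, `-` and `.`, so any host containing non-ASCII letters fails to match.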
Comments
Comment #1
mfer commented
Yes, we need to fix this. There are two parts to this...
First, is the name valid_url still correct? A URL is a subset of a URI, and a URI allows only ASCII characters, not international ones. For international characters we would use an IRI instead of a URI, and the corresponding subset of that is an IRL. This may be a matter of semantics, but I'm still asking.
The current implementation is based on RFC 3986, which covers URIs. The IRI spec is RFC 3987, which is still only a Proposed Standard. That being said, international domain names are out in the wild, so this is a must.
We need to update the domain name and the path. What about the scheme portion (the http part)?
I think we need to replace \w with \pL_, a-z with \pL and 0-9 with \pN.
If someone writes up some tests for this I'll update the regex (unless someone else wants to).
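As a rough illustration of the character-class swap suggested in this comment, here is a sketch in Python. Python's built-in `re` module does not support `\pL` or `\pN`, so this uses `[^\W_]` (any Unicode letter or digit) as an approximation. The pattern and the name `is_unicode_host` are hypothetical and are not a proposed replacement for the Drupal regex.

```python
import re

# Hypothetical Unicode-aware variant of the domain fragment.
# For str patterns, \w is Unicode-aware, so [^\W_] means
# "a word character that is not an underscore": any letter or digit.
UNICODE_HOST = re.compile(r"(?:[^\W_]|[\-.]|%[0-9a-f]{2})+", re.IGNORECASE)


def is_unicode_host(host: str) -> bool:
    """Return True if the whole host consists of letters, digits, '-', '.', or %XX escapes."""
    return UNICODE_HOST.fullmatch(host) is not None


if __name__ == "__main__":
    for host in ("example.com", "例え.テスト", "exa mple.com", "foo_bar.com"):
        print(host, is_unicode_host(host))
```

A class like this only widens the set of accepted characters. It does not check that each label is a valid IDN label, which is one reason a pure regex approach is limited.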
Comment #2
hass commented
IDN support would only require an update to the domain name/hostname validation... the other parts don't need to change. I would also need http://drupal.org/node/295021#comment-1235860.
Comment #3
mfer commented
@hass - well, if we are going to go international, should we limit it to IDN or flat out allow routable URLs like http://例え.テスト/メインページ (ICANN site)?
If we are going to allow international characters, and we should, we should allow them everywhere they will be used in a URL, IRL, or whatever.
Comment #4
hass commented
How should we ever check this with a regex? :-)
Comment #5
alexanderpas commented
Comment #6
alexanderpas commented
Postponed until #389278: Create IDN encoding and decoding functions is in.
Comment #7
dropcube commented
Subscribe
Comment #8
marcvangend commented
#389278: Create IDN encoding and decoding functions has been moved to D8 with priority 'normal'. What to do with this issue?
Comment #9
mfer commented
@marcvangend I'm marking this issue 'by design'. The intent of valid_url is to validate URLs. We are now talking about the IRI space and not the URI space.
So, the current setup is by design. The path forward, encoding/decoding along with validation to handle IDNs, is in that other issue. We can work from there.
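The encode-then-validate approach described here can be sketched in Python as follows. This is only an illustration under stated assumptions: it uses Python's built-in `idna` codec (IDNA 2003) as a stand-in for the encoding function proposed in the other issue, and the names `host_to_ascii` and `is_valid_idn_host` are hypothetical.

```python
import re

# ASCII-only domain fragment, as quoted in the issue summary.
# Assumption: case-insensitive, ASCII-only matching.
ASCII_HOST = re.compile(r"(?:[a-z0-9\-\.]|%[0-9a-f]{2})+", re.IGNORECASE | re.ASCII)


def host_to_ascii(host: str) -> str:
    """Encode a host to its ASCII (Punycode) form using Python's built-in IDNA 2003 codec."""
    return host.encode("idna").decode("ascii")


def is_valid_idn_host(host: str) -> bool:
    """Encode the host to ASCII first, then apply the unchanged ASCII-only check."""
    try:
        ascii_host = host_to_ascii(host)
    except UnicodeError:
        # Raised for labels the codec cannot encode (for example, empty or too long).
        return False
    return ASCII_HOST.fullmatch(ascii_host) is not None


if __name__ == "__main__":
    for host in ("example.com", "例え.テスト", "exa mple.com"):
        print(host, is_valid_idn_host(host))
```

With this split, the URI-level validation stays ASCII-only, and international hostnames are handled by a separate encoding step before validation.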