This task has been created as a central place to develop the IDN/Punycode encode and decode functions.
http://www.phpclasses.org/browse/package/1509.html (LGPL) might be interesting as a starting point.
related issues:
#308138: Make valid_email_address() support IDNs
#295021: filter_var() with FILTER_VALIDATE_URL accepts malformed URLs and rejects not all valid URLs
#368472: valid_url() marks correct IDN domains as invalid
Comments
Comment #1
hass commented: Subscribe
Comment #2
Reg commented: subscribe
Comment #3
dropcube commented: Subscribe
Comment #4
mattyoung commented: subscribe
Comment #5
Breakerandi commented: subscribe
Comment #6
chx commented
Comment #7
mfer commented: Subscribe
Comment #8
BurakD commented: subscribe
Comment #9
chx commented: I am so happy that we have seven subscribes! So who is starting a contrib project to create the code, let it mature, and then get it into D8?
Comment #10
jcfiala commented: I might, chx, as I've been dealing with IDN-related stuff as part of the link module.
That said, I want to convert Link over to D7 first.
I welcome people providing useful references and suggestions on how to handle this problem.
Comment #11
hass commented: While looking around the net for how we can solve the validation issue, I found an idea: convert to ASCII first and validate then. So, converting every URL with http://www.php.net/manual/en/function.idn-to-ascii.php may be an easy solution. After the conversion, run the old-fashioned valid_url(). It sounds too easy to me, but it may work!?
Comment #12
alexanderpas commented: Do note that this solution is only available for PHP 5.3+.
Comment #13
hass commented: Yeah, but better than having no solution :-)
Comment #14
mfer commented: Also note that a lot of environments have idn_to_ascii() compiled out, MAMP for example, and many Drupal developers use MAMP. Note that there is a ticket in MAMP to correct this.
Comment #15
Konstantin Komelin commented: A year ago I had the idea of creating a utility module based on Matthias Sommerfeld's library. But if I understand correctly, it's not allowed to include LGPL code in the Drupal project.
I still use that library for one of my projects and it works well for Russian domains like www.с-днем-рождения.рф
So, we can ask Matthias to relicense it under the GPL or dual-license it.
If Matthias agrees, I will create a contrib project with two simple wrappers, something like drupal_idna_encode($domain) / drupal_idna_decode($punycode).
Does it make sense, guys?
Comment #16
Pancho commented: idn_to_ascii() is available in PHP 5.3, so we can close this for D8.
Also, we most probably don't want to add a library just to backport this to D7.
Comment #17
Mixologic commented: idn_to_ascii() isn't 'available'; it's only an option if you have the intl PHP extension loaded (see comment here: https://drupal.org/node/1427516#comment-8756179).
Additionally, it has a bug where it fails silently and returns an empty string if the string passed into it is longer than 68 characters.
(https://bugs.php.net/bug.php?id=67084)
In order to have this in Drupal 8, I recommend that we vendor a git repo like the following:
https://github.com/true/php-punycode (looks like MIT license)
Or something like this:
http://phlymail.com/en/downloads/idna-convert.html
(GPL)(LGPL)
Or this, if we want a JS solution for whatever front-end needs we might have:
https://github.com/bestiejs/punycode.js (MIT License)
That way we don't have to worry about environmental configurations or bugs, and we can use those encode/decode libraries on other issues.
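For a sense of what these encode/decode libraries do internally, here is a minimal sketch of the Punycode encoding step from RFC 3492 in plain PHP, with no intl dependency. The function names are hypothetical (they do not come from any of the libraries linked above), it assumes PHP 7+ and valid UTF-8 input, and it skips the IDNA mapping and normalization steps that a real library performs, so treat it as an illustration rather than a replacement.

```php
<?php
// Sketch only: RFC 3492 Punycode encoding, without IDNA mapping/normalization.
// All function names here are hypothetical, not part of Drupal or any library.

// Split a valid UTF-8 string into Unicode code points (no mbstring needed).
function sketch_utf8_code_points(string $s): array {
  $chars = preg_split('//u', $s, -1, PREG_SPLIT_NO_EMPTY);
  if ($chars === false) {
    throw new InvalidArgumentException('Input is not valid UTF-8.');
  }
  $points = [];
  foreach ($chars as $char) {
    $bytes = array_values(unpack('C*', $char));
    $n = count($bytes);
    // Single byte: ASCII. Otherwise strip the length marker from the lead byte.
    $cp = ($n === 1) ? $bytes[0] : ($bytes[0] & (0xFF >> ($n + 1)));
    for ($i = 1; $i < $n; $i++) {
      $cp = ($cp << 6) | ($bytes[$i] & 0x3F);
    }
    $points[] = $cp;
  }
  return $points;
}

// Map a digit value 0..35 to its Punycode character (a-z, then 0-9).
function sketch_punycode_digit(int $d): string {
  return chr($d < 26 ? 97 + $d : 22 + $d);
}

// Bias adaptation from RFC 3492, section 6.1 (base 36, tmin 1, tmax 26,
// skew 38, damp 700).
function sketch_punycode_adapt(int $delta, int $numPoints, bool $first): int {
  $delta = $first ? intdiv($delta, 700) : intdiv($delta, 2);
  $delta += intdiv($delta, $numPoints);
  $k = 0;
  while ($delta > 455) {
    $delta = intdiv($delta, 35);
    $k += 36;
  }
  return $k + intdiv(36 * $delta, $delta + 38);
}

// Encode one label (without the "xn--" prefix).
function sketch_punycode_encode(string $label): string {
  $input = sketch_utf8_code_points($label);
  $output = '';
  foreach ($input as $cp) {
    if ($cp < 128) {
      $output .= chr($cp);
    }
  }
  $basic = $handled = strlen($output);
  if ($basic > 0) {
    $output .= '-';
  }
  $n = 128;
  $delta = 0;
  $bias = 72;
  while ($handled < count($input)) {
    // Next code point to handle: the smallest one not yet encoded.
    $m = min(array_filter($input, function ($cp) use ($n) {
      return $cp >= $n;
    }));
    $delta += ($m - $n) * ($handled + 1);
    $n = $m;
    foreach ($input as $cp) {
      if ($cp < $n) {
        $delta++;
      }
      if ($cp === $n) {
        // Emit delta as a variable-length integer in base 36.
        for ($q = $delta, $k = 36; ; $k += 36) {
          $t = $k <= $bias ? 1 : ($k >= $bias + 26 ? 26 : $k - $bias);
          if ($q < $t) {
            break;
          }
          $output .= sketch_punycode_digit($t + ($q - $t) % (36 - $t));
          $q = intdiv($q - $t, 36 - $t);
        }
        $output .= sketch_punycode_digit($q);
        $bias = sketch_punycode_adapt($delta, $handled + 1, $handled === $basic);
        $delta = 0;
        $handled++;
      }
    }
    $delta++;
    $n++;
  }
  return $output;
}

// Convert a whole host name; only labels containing non-ASCII bytes are encoded.
function sketch_host_to_ascii(string $host): string {
  $labels = array_map(function ($label) {
    return preg_match('/[^\x00-\x7F]/', $label)
      ? 'xn--' . sketch_punycode_encode($label)
      : $label;
  }, explode('.', $host));
  return implode('.', $labels);
}
```

With this sketch, sketch_host_to_ascii('bücher.de') gives 'xn--bcher-kva.de'. A vendored library would still be the better choice for core, since it also handles case mapping, normalization, and decoding.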
Comment #18
Konstantin Komelin commented: LGPL! See #15
Comment #19
Mixologic commented: Whoops. I misread the page, but the point is that there are plenty of Punycode implementations out there, and core should support Internationalized Domain Names. There are issues all over core and contrib that relate to this, and if we'd like Drupal to have any adoption in countries that use Internationalized Domain Names, then this is pretty critical to that objective. I'm not sure if this is something that could be achieved in contrib... but maybe?
Comment #20
Konstantin Komelin commented: Thank you @Mixologic for your opinion.
I tend to think that IDN functionality should be in core. I'm not sure if that's possible now because of the unusual "code freeze" (or feature freeze, I can't characterize it) stage of D8.
I also think that we can implement the API as a contrib module for D7. That was my original idea.
And thanks for the links to Punycode libs. They can be useful for my projects.
Best,
Konstantin
Comment #21
Anonymous (not verified) commented: Subscribe
Comment #22
Anonymous (not verified) commented: I think that this issue should be corrected in Drupal 7 as well, as many sites will be using 7 for years to come.
Comment #23
maggie_s commented: In Drupal 7, I used idn_to_ascii() to encode my URLs and valid_url($url, TRUE) for validation.
The validation result is not correct.
"http://akademie-für-gestaltung-regensburg.de/", encoded into "xn--http://akademie-fr-gestaltung-regensburg-0fe.de/", isn't validated.
So solution #11 doesn't work for me.
I installed idna_convert (konstantin.komelin conversion proposal), and "http://akademie-für-gestaltung-regensburg.de/", encoded into "http://xn--akademie-fr-gestaltung-regensburg-uxb87k.de/", validates fine.
However, using valid_url(), "http://www.", "http://." and "http://www" are considered valid when they aren't.
If I use filter_var(), "http://www." and "http://." are invalid, but "http://www" is valid.
Are there any other solution proposals?
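One possible reading of the "xn--http://" result above is that the whole URL, scheme included, was handed to the encoder as if it were a domain name. A minimal sketch of encoding only the host part follows. Both helper names are hypothetical (not Drupal APIs), the sketch assumes PHP 7+ and a URL without a user:password@ part, and the encoder is passed in as a callable so that a bundled library can be swapped in where intl is missing.

```php
<?php
// Sketch only: url_host_to_ascii_sketch() and intl_host_encoder_sketch() are
// hypothetical helpers, not Drupal APIs. Assumes a URL without userinfo.

// Wrapper around intl's idn_to_ascii(); returns false when intl is unavailable.
function intl_host_encoder_sketch(string $host) {
  if (!function_exists('idn_to_ascii')) {
    return false;
  }
  return idn_to_ascii($host, IDNA_DEFAULT, INTL_IDNA_VARIANT_UTS46);
}

// Encode only the host of $url; scheme, port, path, query and fragment are
// passed through untouched. Returns NULL if the URL cannot be split or the
// encoder fails.
function url_host_to_ascii_sketch(string $url, ?callable $encodeHost = null): ?string {
  $encodeHost = $encodeHost ?? 'intl_host_encoder_sketch';
  // scheme://host[:port][/path][?query][#fragment]
  $pattern = '~^([a-z][a-z0-9+.\-]*://)([^/?#:@]+)((?::\d*)?(?:[/?#].*)?)$~isuD';
  if (!preg_match($pattern, $url, $m)) {
    return null;
  }
  $ascii = $encodeHost($m[2]);
  if (!is_string($ascii) || $ascii === '') {
    // Encoder missing or failed (including the silent empty-string case).
    return null;
  }
  return $m[1] . $ascii . $m[3];
}
```

The result can then be passed to valid_url($url, TRUE) as proposed in #11. Treating an empty string from the encoder as a failure is a deliberate choice here, given the silent-failure bug mentioned in #17.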
Comment #24
Konstantin Komelin commented: @maggie_drupal write your own regular expression ;)
Comment #25
Konstantin Komelin commented: And please don't check protocols, check domain names
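A rough sketch of what such a domain-name check might look like, applied to the host after it has been converted to ASCII. The function name is hypothetical, and the rules are assumptions chosen to match the cases in #23: letters, digits and hyphens only, labels of 1 to 63 characters with no leading or trailing hyphen, at most 253 characters overall, and at least two labels. Requiring two labels rejects single-label hosts such as "localhost", and the empty-label rule rejects a trailing dot, which may or may not be what a given site wants.

```php
<?php
// Sketch only: is_valid_ascii_hostname_sketch() is a hypothetical helper,
// not a Drupal API. It expects an already ASCII-encoded (Punycode) host.
function is_valid_ascii_hostname_sketch(string $host): bool {
  if ($host === '' || strlen($host) > 253) {
    return false;
  }
  $labels = explode('.', $host);
  // Assumption: a bare "www" should be rejected, so require two labels.
  if (count($labels) < 2) {
    return false;
  }
  foreach ($labels as $label) {
    // 1-63 characters, letters/digits/hyphens, no hyphen at either end.
    // An empty label (as in "www." or ".") fails this pattern.
    if (!preg_match('/^[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?$/iD', $label)) {
      return false;
    }
  }
  return true;
}
```

Under these rules "www", "www." and "." are all rejected, while "xn--bcher-kva.de" passes.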
Comment #26
Konstantin Komelin commented: Alright, I've created the IDNA Convert module, which provides the conversion functions for D7. It works perfectly.
Please note that it includes an LGPL library, whose use has not been fully approved yet.
Feel free to port it to D8 with a couple of lines of code if you wish.