Not all web addresses are turned into links. Currently, only addresses that start with a protocol ('http(s)://') or the 'www' prefix are linked. Use of the www prefix is declining, so this rule misses valid web addresses like:
The current regex matches on http and www, but we could add a third option: match on the known top-level domains (generic and country-code).
Two improvements we could make:
1. Match on a valid top-level domain. We would then have to include the 220 top-level domains (org|com|uk|ly etc.), and that would fix the addresses above.
2. Match the new private TLDs (*.anything). This will probably be harder if we want to avoid false positives.
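A minimal sketch of the first improvement, written in Python for illustration only (the project's actual regex and language may differ). The names `KNOWN_TLDS`, `URL_RE` and `find_urls` are hypothetical, and the TLD list is a tiny subset standing in for the full list:

```python
import re

# Hypothetical, illustrative subset; the real list would hold every
# generic and country TLD.
KNOWN_TLDS = ["com", "org", "net", "uk", "ly", "nl"]

URL_RE = re.compile(
    r"""
    (?<![\w@.])                      # not inside a word or an e-mail address
    (?:
        https?://[^\s<>]+            # 1. explicit protocol
      | www\.[^\s<>]+                # 2. www prefix
      | (?:[a-z0-9-]+\.)+            # 3. bare domain: one or more labels...
        (?:%s)\b                     #    ...ending in a known TLD
        (?:/[^\s<>]*)?               #    optional path
    )
    """ % "|".join(KNOWN_TLDS),
    re.IGNORECASE | re.VERBOSE,
)


def find_urls(text):
    """Return candidate web addresses, without trailing punctuation."""
    return [m.group(0).rstrip(".,;:!?)") for m in URL_RE.finditer(text)]
```

With this subset, `find_urls("Visit example.org or bit.ly/abc, not file.txt")` picks up `example.org` and `bit.ly/abc` but skips `file.txt`, because `txt` is not in the list. The word boundary after the TLD keeps `example.community` from matching on `com`.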
If such a TLD address has a trailing path, we could probably match it safely:
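A rough sketch of that idea in Python, assuming "trail" means a path starting with a slash right after the domain. The names `PATH_URL_RE` and `find_path_urls` are hypothetical, and the pattern accepts any alphabetic TLD of two or more letters, so it can still produce false positives (for example a file name followed by a slash):

```python
import re

# Any dotted name whose last label is alphabetic, but only when a "/" follows
# immediately. The slash is the extra signal that this is a web address.
PATH_URL_RE = re.compile(
    r"(?<![\w@.])(?:[a-z0-9-]+\.)+[a-z]{2,}/[^\s<>]*",
    re.IGNORECASE,
)


def find_path_urls(text):
    """Return addresses with an unknown TLD that are followed by a path."""
    return [m.group(0).rstrip(".,;:!?)") for m in PATH_URL_RE.finditer(text)]
```

Here `example.anything/page` is matched, while a bare `example.anything` is left alone.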
Twitter does a very good job of converting all kinds of URLs, in all kinds of situations, to shortened links.
Code can be found here:
A PHP version based on that code is available here: https://github.com/stephenbeckett/TwitterURLMatchPHP