Library for International Domain Names (IDNA) 2008
Historically, domain names could only be composed of ASCII characters (for instance:
A new technique, called Internationalized Domain Names (IDN for short), allows you to use most of the Unicode characters, so that you can have for instance
To ensure compatibility with the existing software that makes the internet work, domain names containing non-ASCII characters are represented in Punycode, an encoding that uses only ASCII characters.
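To make the Punycode representation concrete, here is a minimal sketch of the encoding step described in RFC 3492. It is not this library's implementation: every function name below is hypothetical, and the sketch skips the mapping and validation steps that the IDNA standards require before encoding. It assumes the input is valid UTF-8.

```php
<?php
// Hypothetical helpers sketching RFC 3492; NOT the API of this library.

// Decode one UTF-8 encoded character into its Unicode code point.
function codePointSketch($char)
{
    $b = array_values(unpack('C*', $char));
    switch (count($b)) {
        case 1:
            return $b[0];
        case 2:
            return (($b[0] & 0x1F) << 6) | ($b[1] & 0x3F);
        case 3:
            return (($b[0] & 0x0F) << 12) | (($b[1] & 0x3F) << 6) | ($b[2] & 0x3F);
        default:
            return (($b[0] & 0x07) << 18) | (($b[1] & 0x3F) << 12) | (($b[2] & 0x3F) << 6) | ($b[3] & 0x3F);
    }
}

// Bias adaptation from RFC 3492 (base 36, tmin 1, tmax 26, skew 38, damp 700).
function adaptSketch($delta, $numPoints, $firstTime)
{
    $delta = $firstTime ? (int) ($delta / 700) : (int) ($delta / 2);
    $delta += (int) ($delta / $numPoints);
    for ($k = 0; $delta > 455; $k += 36) {
        $delta = (int) ($delta / 35);
    }
    return $k + (int) (36 * $delta / ($delta + 38));
}

// Encode a single label that contains non-ASCII characters (no "xn--" prefix).
function punycodeEncodeSketch($label)
{
    $digits = 'abcdefghijklmnopqrstuvwxyz0123456789';
    $codePoints = array_map('codePointSketch', preg_split('//u', $label, -1, PREG_SPLIT_NO_EMPTY));
    // Copy the ASCII characters first, followed by a delimiter.
    $output = '';
    foreach ($codePoints as $c) {
        if ($c < 128) {
            $output .= chr($c);
        }
    }
    $basicCount = $handled = strlen($output);
    if ($basicCount > 0) {
        $output .= '-';
    }
    $n = 128;
    $delta = 0;
    $bias = 72;
    while ($handled < count($codePoints)) {
        // Smallest code point that has not been encoded yet.
        $m = min(array_filter($codePoints, function ($c) use ($n) { return $c >= $n; }));
        $delta += ($m - $n) * ($handled + 1);
        $n = $m;
        foreach ($codePoints as $c) {
            if ($c < $n) {
                $delta++;
            }
            if ($c === $n) {
                // Emit delta as a variable-length base-36 integer.
                $q = $delta;
                for ($k = 36; ; $k += 36) {
                    $t = $k <= $bias ? 1 : ($k >= $bias + 26 ? 26 : $k - $bias);
                    if ($q < $t) {
                        break;
                    }
                    $output .= $digits[$t + ($q - $t) % (36 - $t)];
                    $q = (int) (($q - $t) / (36 - $t));
                }
                $output .= $digits[$q];
                $bias = adaptSketch($delta, $handled + 1, $handled === $basicCount);
                $delta = 0;
                $handled++;
            }
        }
        $delta++;
        $n++;
    }
    return $output;
}

// Encode only the labels that contain non-ASCII bytes.
function domainToPunycodeSketch($domain)
{
    $labels = array();
    foreach (explode('.', $domain) as $label) {
        $labels[] = preg_match('/[^\x00-\x7F]/', $label) ? 'xn--' . punycodeEncodeSketch($label) : $label;
    }
    return implode('.', $labels);
}

echo domainToPunycodeSketch('www.schloß.com'), "\n"; // www.xn--schlo-pqa.com
```

The sketch avoids the `mbstring` and `intl` extensions on purpose, relying only on PCRE's `u` modifier to split the string into characters.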
The generation of Punycode starting from an IDN should be case insensitive: browsing to
www.example.com should be the same as browsing to
In PHP, converting a string to lower case is as easy as calling `strtolower`, but this function does not handle non-ASCII characters (in fact, it may corrupt IDN names).
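The following short sketch illustrates the limitation. It assumes the script file is saved as UTF-8; the domain name is made up for the example. As of PHP 8.2 `strtolower` is locale-insensitive and only touches the ASCII letters A-Z, while on earlier versions the result for bytes above 0x7F depends on the current locale, which is how multibyte characters can end up corrupted.

```php
<?php
// Upper-case input containing a non-ASCII letter (É is 0xC3 0x89 in UTF-8).
$name = "WWW.ÉCOLE.COM";

// strtolower() works byte by byte, so the two bytes of É are never
// treated as a single letter.
$lower = strtolower($name);

// What a Unicode-aware lowercasing would have produced.
$expected = "www.école.com";

var_dump($lower === $expected); // bool(false) on PHP 8.2+
```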
If you have the `mbstring` PHP extension, you might consider using the `mb_strtolower` function it offers.
However, even `mb_strtolower` isn't a good choice, for these reasons:
- the `mbstring` PHP extension may not be available
- the behaviour of `mb_strtolower` changes across PHP versions (for instance, `Ԩ` is correctly converted to `ԩ` by PHP 7.0, but earlier versions left it unchanged)
- `mb_strtolower` does not translate many of the Unicode characters that the standards say should be mapped
Unicode offers a mapping table with the recommended mapping (for instance, case normalization like
a, but also
Two standards define the mapping that should be applied to IDNs: IDNA2003 and IDNA2008. IDNA2008 is largely backward compatible with IDNA2003, but there are some incompatible differences.
For instance, IDNA2003 required `ß` to be mapped to `ss`, whereas IDNA2008 allows `ß` to be used as-is. So older browsers and client software resolved `www.schloß.com` to the Punycode corresponding to `www.schloss.com`, whereas newer browsers resolve it to the Punycode of `www.schloß.com` itself.
Because the resulting Punycode differs (this is called a deviation), this can lead to serious security issues, so you need to know whether a domain name is deviated.
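As a rough illustration of what a deviation check involves, the sketch below applies the IDNA2003-style mapping to the four deviation characters listed in Unicode Technical Standard #46 (`ß`, final sigma `ς`, and the two zero-width joiners) and reports whether the name changed. The function names are hypothetical and this is not how the library is implemented; it also assumes the name has already been lower-cased and normalized.

```php
<?php
// Hypothetical helper, NOT part of this library: apply the IDNA2003-style
// mapping for the four deviation characters listed in UTS #46.
function deviationMapSketch($name)
{
    return strtr($name, array(
        "\xC3\x9F" => 'ss',       // ß (U+00DF) -> ss
        "\xCF\x82" => "\xCF\x83", // ς (U+03C2, final sigma) -> σ (U+03C3)
        "\xE2\x80\x8C" => '',     // U+200C ZERO WIDTH NON-JOINER -> removed
        "\xE2\x80\x8D" => '',     // U+200D ZERO WIDTH JOINER -> removed
    ));
}

// Hypothetical helper: a name is deviated if the two standards disagree on it.
function isDeviatedSketch($name)
{
    return deviationMapSketch($name) !== $name;
}

var_dump(isDeviatedSketch('www.schloß.com'));    // bool(true)
var_dump(isDeviatedSketch('www.example.com'));   // bool(false)
echo deviationMapSketch('www.schloß.com'), "\n"; // www.schloss.com
```

The characters are written as byte escapes so that the sketch does not depend on the `\u{...}` syntax introduced in PHP 7.0.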
- no dependency on any PHP extension
- no dependency on any other PHP library
- consistent results across different PHP versions
- results are guaranteed to follow the standards (it's not just a bare multibyte-to-Punycode conversion library)
- designed with speed in mind
- compatible with PHP 5.3
```php
use MLocati\IDNA\DomainName;

require_once 'autoload.php'; // Not required if you use composer

$domain = DomainName::fromName('www。schloß.COM');

echo "Name: ", $domain->getName(), "\n";
echo "Punycode: ", $domain->getPunycode(), "\n";
echo "Deviated: ", $domain->isDeviated() ? 'yes' : 'no', "\n";
echo "Deviated Name: ", $domain->getDeviatedName(), "\n";
echo "Deviated Punycode: ", $domain->getDeviatedPunycode(), "\n";
```
```
Name: www.schloß.com
Punycode: www.xn--schlo-pqa.com
Deviated: yes
Deviated Name: www.schloss.com
Deviated Punycode: www.schloss.com
```