Converting punycode with dash character to Unicode

I need to convert the punycode NIATO-OTABD to nñiñatoñ.

I found a text converter in JavaScript the other day, but the punycode conversion doesn’t work if there’s a dash in the middle.

Any suggestion to fix the “dash” issue?

Answer

I took the time to create the Punycode implementation below. It is based on the C code in RFC 3492. To use it with domain names, remove the xn-- prefix from the input before decoding, and add it to the output after encoding.
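For readers who want to see how the dash is handled, here is a minimal decoding sketch that follows the procedure described in RFC 3492. It is not the original code from this answer: the names punycodeDecode, digitValue and adapt are illustrative, and the overflow checks from the RFC's C code are left out for brevity. The key point is that only the last dash separates the literal ASCII part from the encoded part.

```javascript
// Minimal Punycode decoder sketch following the RFC 3492 decoding procedure.
// Illustrative only: function names are assumptions, overflow checks are omitted.
const BASE = 36, T_MIN = 1, T_MAX = 26, SKEW = 38, DAMP = 700;
const INITIAL_BIAS = 72, INITIAL_N = 128;

// Map a basic code point to its digit value (a-z / A-Z = 0..25, 0-9 = 26..35).
function digitValue(cp) {
  if (cp >= 0x30 && cp <= 0x39) return cp - 0x30 + 26;
  if (cp >= 0x41 && cp <= 0x5a) return cp - 0x41;
  if (cp >= 0x61 && cp <= 0x7a) return cp - 0x61;
  return BASE; // signals an invalid digit
}

// Bias adaptation function from RFC 3492.
function adapt(delta, numPoints, firstTime) {
  delta = firstTime ? Math.floor(delta / DAMP) : Math.floor(delta / 2);
  delta += Math.floor(delta / numPoints);
  let k = 0;
  while (delta > ((BASE - T_MIN) * T_MAX) / 2) {
    delta = Math.floor(delta / (BASE - T_MIN));
    k += BASE;
  }
  return k + Math.floor(((BASE - T_MIN + 1) * delta) / (delta + SKEW));
}

function punycodeDecode(input) {
  const output = []; // decoded code points
  let n = INITIAL_N, i = 0, bias = INITIAL_BIAS;

  // Everything before the LAST dash is literal ASCII and is copied as-is.
  const b = Math.max(input.lastIndexOf('-'), 0);
  for (let j = 0; j < b; j++) {
    const cp = input.charCodeAt(j);
    if (cp >= 0x80) throw new RangeError('Non-basic code point in input');
    output.push(cp);
  }

  // Everything after the last dash is a sequence of variable-length integers.
  let pos = b > 0 ? b + 1 : 0;
  while (pos < input.length) {
    const oldI = i;
    let w = 1;
    for (let k = BASE; ; k += BASE) {
      if (pos >= input.length) throw new RangeError('Truncated input');
      const digit = digitValue(input.charCodeAt(pos++));
      if (digit >= BASE) throw new RangeError('Invalid digit');
      i += digit * w;
      const t = k <= bias ? T_MIN : k >= bias + T_MAX ? T_MAX : k - bias;
      if (digit < t) break;
      w *= BASE - t;
    }
    const len = output.length + 1;
    bias = adapt(i - oldI, len, oldI === 0);
    n += Math.floor(i / len); // advance to the next code point to insert
    i %= len;                 // position at which to insert it
    output.splice(i++, 0, n);
  }
  return String.fromCodePoint(...output);
}

console.log(punycodeDecode('niato-otabd')); // nñiñatoñ
```

The literal part keeps its case, so the uppercase input NIATO-OTABD from the question decodes to NñIñATOñ with this sketch.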

The utf16 class is needed to convert between JavaScript's internal character representation (UTF-16) and Unicode code points.
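As an illustration of what such a conversion involves, the sketch below uses only built-in string methods. The function names toCodePoints and fromCodePoints are made up for this example and are not the API of the utf16 class described here.

```javascript
// Sketch: convert between a JavaScript string (UTF-16 code units)
// and an array of Unicode code points. Names are illustrative.
function toCodePoints(str) {
  // Iterating a string walks it by code point, so surrogate pairs stay together.
  return Array.from(str, (ch) => ch.codePointAt(0));
}

function fromCodePoints(codePoints) {
  return String.fromCodePoint(...codePoints);
}

const sample = 'a\u{1F600}';                 // "a" followed by an emoji outside the BMP
console.log(sample.length);                  // 3 UTF-16 code units
console.log(toCodePoints(sample).length);    // 2 code points
```

Punycode operates on code points, which is why a character stored as a surrogate pair has to be combined into a single number before encoding.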

There are also ToASCII and ToUnicode functions to make it easier to convert between Unicode domain names (IDN) and their Punycode-encoded ASCII form.
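The prefix handling for whole domain names can be sketched roughly as follows. This is an assumption about how such helpers might look, not the original ToASCII and ToUnicode code: domainToUnicode and domainToAscii are hypothetical names, and the actual label codec is passed in as a function so the sketch stays independent of any particular Punycode implementation.

```javascript
// Sketch: apply a Punycode codec label by label, handling the "xn--" prefix.
// decodeLabel / encodeLabel are supplied by the caller (hypothetical parameters).
const ACE_PREFIX = 'xn--';

function domainToUnicode(domain, decodeLabel) {
  return domain
    .split('.')
    .map((label) =>
      label.toLowerCase().startsWith(ACE_PREFIX)
        ? decodeLabel(label.slice(ACE_PREFIX.length)) // strip prefix, then decode
        : label                                        // plain ASCII label, untouched
    )
    .join('.');
}

function domainToAscii(domain, encodeLabel) {
  return domain
    .split('.')
    .map((label) =>
      /[^\x00-\x7f]/.test(label)
        ? ACE_PREFIX + encodeLabel(label) // encode, then add prefix
        : label
    )
    .join('.');
}
```

Only labels that need it are touched, so a name like www.example.com passes through unchanged in both directions.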


Licence:

From RFC3492:

Disclaimer and license

Regarding this entire document or any portion of it (including the pseudocode and C code), the author makes no guarantees and is not responsible for any damage resulting from its use. The author grants irrevocable permission to anyone to use, modify, and distribute it in any way that does not diminish the rights of anyone else to use, modify, and distribute it, provided that redistributed derivative works do not contain misleading author or version information. Derivative works need not be licensed under similar terms.

I place my own work on this punycode and utf16 code in the public domain. It would be nice to get an email telling me which project you use it in.

The scope of the code

Each TLD has its own rules for which code points are allowed. The scope of this code is only to encode and decode a string between Punycode and the internal encoding used by JavaScript, regardless of those rules. Depending on your use case, you may need to filter the string. One example is U+FE0F (Variation Selector-16), an invisible code point that specifies that the previous character should be displayed with emoji presentation. If you search for “allowed code points in IDN” you should find several projects that can help you filter the string.
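A very small filtering step might look like the sketch below. The deny-list and the name stripDisallowed are purely illustrative: it only removes U+FE0F, whereas a real registry's rules are far more detailed and are usually expressed as lists of allowed code points.

```javascript
// Sketch: remove code points you do not want to reach the encoder.
// DISALLOWED is a hypothetical deny-list, not any registry's actual policy.
const DISALLOWED = new Set([0xfe0f]); // U+FE0F VARIATION SELECTOR-16

function stripDisallowed(str, disallowed = DISALLOWED) {
  return Array.from(str) // split by code point, not by UTF-16 code unit
    .filter((ch) => !disallowed.has(ch.codePointAt(0)))
    .join('');
}

console.log(stripDisallowed('\u2764\ufe0f').length); // 1
```

Running a step like this before encoding avoids producing labels that differ only by an invisible code point.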

User contributions licensed under: CC BY-SA