Tag: utf-16

Extract substring by utf-8 byte positions

character-encoding javascript string utf-16 utf-8

I have a string, plus a start and a length with which to extract a substring. Both values (start and length) are byte offsets into the original UTF-8 string. The problem: because the start and length are in bytes, I cannot use “substring”. The UTF-8 string contains several multi-b…
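
One possible approach to the problem described in this excerpt, sketched under assumptions: encode the JavaScript string to UTF-8 bytes, slice the bytes, and decode the slice. The helper name `substringByUtf8Bytes` and its parameters are hypothetical and do not come from the question. The sketch assumes an environment where `TextEncoder` and `TextDecoder` are available as globals (current browsers and recent Node.js releases).

```javascript
// Hypothetical helper: the name and signature are illustrative only.
// byteStart and byteLength are offsets into the UTF-8 encoding of str,
// not into its UTF-16 code units.
function substringByUtf8Bytes(str, byteStart, byteLength) {
  // Encode the whole string to UTF-8 bytes.
  const bytes = new TextEncoder().encode(str);
  // View the requested byte range without copying.
  const slice = bytes.subarray(byteStart, byteStart + byteLength);
  // fatal: true makes decode() throw a TypeError if the range
  // cuts through the middle of a multi-byte character.
  return new TextDecoder("utf-8", { fatal: true }).decode(slice);
}

// "héllo wörld": é and ö each take two bytes in UTF-8.
const sample = "h\u00e9llo w\u00f6rld";
console.log(substringByUtf8Bytes(sample, 0, 6)); // first six bytes
console.log(substringByUtf8Bytes(sample, 7, 6)); // six bytes starting at byte 7
```

Encoding the full string on every call is simple but costs time proportional to the string length; for many extractions from one string, encoding once and reusing the byte array would likely be cheaper.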

JavaScript strings – UTF-16 vs UCS-2?

javascript utf-16

I’ve read in some places that JavaScript strings are UTF-16, and in other places they’re UCS-2. I did some searching around to try to figure out the difference and found this: Q: What is the difference between UCS-2 and UTF-16? A: UCS-2 is obsolete terminology which refers to a Unicode implementat…
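
As a rough illustration of why the distinction matters in practice, the sketch below inspects a character outside the Basic Multilingual Plane. The helper name `inspectString` is hypothetical. It relies only on standard string behavior: `length` and `charCodeAt` work on 16-bit code units, while `codePointAt` and string iteration work on code points.

```javascript
// Hypothetical helper for illustration: reports how a string looks when
// counted in 16-bit code units versus Unicode code points.
function inspectString(s) {
  return {
    codeUnits: s.length,                 // number of 16-bit code units
    codePoints: Array.from(s).length,    // iteration walks code points
    firstUnit: s.charCodeAt(0),          // first 16-bit code unit
    firstCodePoint: s.codePointAt(0),    // joins a surrogate pair if present
  };
}

// U+1F600 lies outside the BMP, so it is stored as a surrogate pair.
const info = inspectString("\u{1F600}");
console.log(info.codeUnits);                   // 2
console.log(info.codePoints);                  // 1
console.log(info.firstUnit.toString(16));      // d83d
console.log(info.firstCodePoint.toString(16)); // 1f600
```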