Tag: utf-16

Extract substring by utf-8 byte positions

I have a string, plus a start and a length with which to extract a substring. Both values are byte offsets into the original UTF-8 string, not character positions, so I cannot use “substring” directly. The UTF-8 string contains several multi-byte characters. Is there a hyper-efficient way of
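For illustration only, here is a minimal sketch of one way to slice by UTF-8 byte positions in JavaScript. It assumes the standard `TextEncoder` and `TextDecoder` globals are available (modern browsers and recent Node.js), and the helper name `substringByUtf8Bytes` is hypothetical. It re-encodes the whole string first, so it is simple rather than necessarily the fastest option.

```javascript
// Hypothetical helper: byteStart and byteLength are measured in the
// UTF-8 encoding of `str`, not in JavaScript string indices.
function substringByUtf8Bytes(str, byteStart, byteLength) {
  // Encode the whole string to UTF-8 bytes (Uint8Array).
  const bytes = new TextEncoder().encode(str);
  // subarray creates a view over the same buffer, without copying.
  const slice = bytes.subarray(byteStart, byteStart + byteLength);
  // Decode only the selected bytes back into a string.
  return new TextDecoder("utf-8").decode(slice);
}

// "a" is 1 byte, "é" is 2 bytes, "€" is 3 bytes, "b" is 1 byte.
console.log(substringByUtf8Bytes("aé€b", 1, 5)); // é€
console.log(substringByUtf8Bytes("aé€b", 3, 3)); // €
```

If the offsets can fall in the middle of a multi-byte character, the default decoder substitutes U+FFFD for the broken bytes. Constructing the decoder with `{ fatal: true }` makes it throw a `TypeError` instead, which may be preferable when the offsets are expected to be exact.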

JavaScript strings – UTF-16 vs UCS-2?

I’ve read in some places that JavaScript strings are UTF-16, and in other places they’re UCS-2. I did some searching around to try to figure out the difference and found this: Q: What is the difference between UCS-2 and UTF-16? A: UCS-2 is obsolete terminology which refers to a Unicode implementation up to Unicode 1.1, before surrogate code points and
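As a small illustration of why the distinction matters in practice, the sketch below shows how a character outside the Basic Multilingual Plane appears in a JavaScript string. The helper name `countUnits` is hypothetical. The example relies on `length` and `charCodeAt` working on 16-bit code units, while `codePointAt` and string iteration combine surrogate pairs into one code point.

```javascript
// Hypothetical helper: compare the number of 16-bit code units
// with the number of Unicode code points in a string.
function countUnits(str) {
  return {
    codeUnits: str.length,              // counts 16-bit code units
    codePoints: Array.from(str).length, // iteration joins surrogate pairs
  };
}

const grinning = "\u{1F600}"; // U+1F600, outside the BMP

console.log(countUnits(grinning));                    // { codeUnits: 2, codePoints: 1 }
console.log(grinning.charCodeAt(0).toString(16));     // d83d (high surrogate)
console.log(grinning.charCodeAt(1).toString(16));     // de00 (low surrogate)
console.log(grinning.codePointAt(0).toString(16));    // 1f600
```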
