The problem I am facing right now is that when a string contains emojis with a skin tone other than the default yellow, JavaScript splits each of them into several characters instead of one.
With emojis like these there is no problem, and I get the results I want:
let strs = [..."πππ€©πππ£π€©"];
console.log(strs);
console.log(strs.length);
But with emojis like these there is a problem, because JavaScript does not let me use the spread syntax ([...]) on them the way I expect:
let strs = [..."π§πΎπ¨π»π§πΌπ¦π½π§πΏ"];
console.log(strs);
console.log(strs.length);
How can I tell JavaScript that this is a single emoji with a length of one, and not two or more emojis, as in this example:
let strs = [..."👩‍❤️‍💋‍👩"];
console.log(strs);
console.log(strs.length);
Answer
The iterator of strings (invoked via the spread syntax ...) iterates over the code points of the string. Some emojis are made up of multiple code points, which causes them to be split unintentionally, as you have seen. In more recent versions of lodash, you can use _.split(), which is able to handle emojis and ZWJ characters:
const r1 = _.split("👩‍❤️‍💋‍👩", '');
const r2 = _.split("π§πΎπ¨π»π§πΌπ¦π½π§πΏ", '');
// See browser console for output:
console.log(r1, r1.length);
console.log(r2, r2.length);
<script src="https://cdnjs.cloudflare.com/ajax/libs/lodash.js/4.17.21/lodash.min.js" integrity="sha512-WFN04846sdKMIP5LKNphMaWzU7YpMyCU245etK3g/2ARYbPK9Ub18eG+ljU96qKRCWh+quCY7yefSmlkQw1ANQ==" crossorigin="anonymous" referrerpolicy="no-referrer"></script>
Note that you don't need to include the entire lodash library to use this method; you can include just that method instead.
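To make the cause of the split concrete, here is a minimal sketch of what the string iterator actually yields. The strings are illustrative assumptions written with Unicode escapes (a woman emoji plus a medium-dark skin tone modifier, and a woman-kiss-woman ZWJ sequence); they are not necessarily the exact emojis from the question.

```javascript
// Hypothetical example: "woman" (U+1F469) followed by a
// medium-dark skin tone modifier (U+1F3FE).
const toned = "\u{1F469}\u{1F3FE}";

// The spread syntax yields one entry per code point, so the base emoji
// and its modifier end up as two separate array elements.
const codePoints = [...toned].map(
  ch => "U+" + ch.codePointAt(0).toString(16).toUpperCase()
);

console.log(toned.length); // 4 (UTF-16 code units)
console.log(codePoints);   // ["U+1F469", "U+1F3FE"]

// Hypothetical example: a ZWJ sequence (woman, heart, kiss mark, woman)
// that is displayed as a single emoji.
const kiss = "\u{1F469}\u200D\u2764\uFE0F\u200D\u{1F48B}\u200D\u{1F469}";
console.log([...kiss].length); // 8 code points
```

So neither .length nor the spread syntax counts what a reader perceives as one character; both count smaller units.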
There is also a stage 4 proposal for Intl.Segmenter, which is an API that will allow you to split/segment your string by specifying a granularity. It involves creating a segmenter which can split strings up based on their graphemes (i.e. the visual emoji characters). When you use the segmenter on your string, you'll get an iterator, which you can then convert into an array of characters using Array.from():
const graphemeSplit = str => {
  const segmenter = new Intl.Segmenter("en", {granularity: 'grapheme'});
  const segitr = segmenter.segment(str);
  return Array.from(segitr, ({segment}) => segment);
}

// See browser console for output
console.log(graphemeSplit("👩‍❤️‍💋‍👩")); // ["👩‍❤️‍💋‍👩"]
console.log(graphemeSplit("π§πΎπ¨π»π§πΌπ¦π½π§πΏ")); // ["π§πΎ", "π¨π»", "π§πΌ", "π¦π½", "π§πΏ"]
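Since the question is also about the length, the same approach can count graphemes directly. This is only a sketch: graphemeLength is a hypothetical helper name, the test strings are illustrative assumptions written with Unicode escapes, and the code assumes a runtime that ships Intl.Segmenter (it throws otherwise instead of silently falling back to code points).

```javascript
// Hypothetical helper: counts user-perceived characters (graphemes)
// rather than UTF-16 code units or code points.
function graphemeLength(str, locale = "en") {
  if (typeof Intl === "undefined" || typeof Intl.Segmenter !== "function") {
    throw new Error("Intl.Segmenter is not available in this runtime");
  }
  const segmenter = new Intl.Segmenter(locale, { granularity: "grapheme" });
  let count = 0;
  // Each iteration step corresponds to one grapheme segment.
  for (const _segment of segmenter.segment(str)) {
    count++;
  }
  return count;
}

// Assumed example inputs: a woman-kiss-woman ZWJ sequence, and two
// emojis that each carry a skin tone modifier.
const kissSequence = "\u{1F469}\u200D\u2764\uFE0F\u200D\u{1F48B}\u200D\u{1F469}";
const tonedPair = "\u{1F469}\u{1F3FE}\u{1F468}\u{1F3FB}";

console.log(graphemeLength(kissSequence)); // 1
console.log(graphemeLength(tonedPair));    // 2
```

Throwing when the API is missing is a deliberate choice here: a fallback to the spread syntax would return the wrong counts for exactly the emojis in question.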