// BMP: Basic Multilingual Plane (U+0000 to U+FFFF)
// UTF-16 (16-bit Unicode Transformation Format) is an extension of UCS-2 that allows representing code points outside the BMP.
// It produces a variable-length result of either one or two 16-bit code units per code point.
// This way, it can encode code points in the range from 0 to 0x10FFFF. (source: https://mathiasbynens.be/notes/javascript-encoding)
// "Unicode code points 2^16 and above are represented in JavaScript by two code units, known as a surrogate pair." Effective JS, Herman (29)

import stringToCodePointArray from 'string-to-code-point-array'

const outsideBMP = '𝌆'
const insideBMP = 'a'

console.log('string length outside BMP')
console.log(outsideBMP.length) // 2

console.log('string length')
console.log(insideBMP.length) // 1

console.log('code point array length outside BMP')
console.log(stringToCodePointArray(outsideBMP).length) // 1

console.log('code point array length')
console.log(stringToCodePointArray(insideBMP).length) // 1
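
// A minimal sketch of the surrogate-pair arithmetic described in the comments above,
// plus a dependency-free way to count code points. The helper names
// toUtf16CodeUnits and countCodePoints are hypothetical and not part of the
// original file; only built-in String and Array methods are assumed.

```javascript
// Hypothetical helper: split one code point into its UTF-16 code units.
// Code points inside the BMP map to a single unit; those above U+FFFF
// map to a high/low surrogate pair.
function toUtf16CodeUnits (codePoint) {
  if (!Number.isInteger(codePoint) || codePoint < 0 || codePoint > 0x10FFFF) {
    throw new RangeError('code point must be an integer from 0 to 0x10FFFF')
  }
  if (codePoint <= 0xFFFF) {
    return [codePoint]
  }
  const offset = codePoint - 0x10000 // 20-bit value
  const high = 0xD800 + (offset >> 10) // top 10 bits
  const low = 0xDC00 + (offset & 0x3FF) // bottom 10 bits
  return [high, low]
}

// Hypothetical helper: count code points with the built-in string iterator,
// which walks a string by code point rather than by code unit.
function countCodePoints (str) {
  return Array.from(str).length
}

const tetragram = '\u{1D306}' // same character as outsideBMP above
const units = toUtf16CodeUnits(tetragram.codePointAt(0))

console.log(units.map(u => u.toString(16))) // [ 'd834', 'df06' ]
console.log(countCodePoints(tetragram)) // 1
console.log(tetragram.length) // 2
```

// The two units match what charCodeAt(0) and charCodeAt(1) report for the same
// string, which is why .length returns 2 for a single character outside the BMP.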