@zoellner
Created May 28, 2020 20:37

Revisions

  1. zoellner created this gist May 28, 2020.
    24 changes: 24 additions & 0 deletions utf16test.js
    @@ -0,0 +1,24 @@
    const fs = require('fs');

    // The demo string contains emoji and accented characters typed literally, which assumes this file is saved as UTF-8 and your editor can display them.
    // If not, you can test this gist by setting utf8string to the equivalent escaped form '->\ud83d\ude03\ud83e\uddd2\ud83c\udffc\u00fc\u010d\u0113<-'
    // Gist doesn't render all ZWJ sequences, so none is shown here, but the approach works with those as well, e.g. '\ud83d\udc68\ud83c\udffc\u200d\ud83d\udcbb'
    const utf8string = '->😃🧒🏼üčē<-';

    // This is what you'd usually do to write a UTF-8 encoded file (fs.writeFileSync encodes strings as utf8 by default)
    fs.writeFileSync('test-utf8.txt', utf8string);

    // Sometimes you need to write UTF-16/UCS-2 encoded files instead.
    // Node has built-in support for utf16le (Little Endian), but if you just call fs.writeFileSync(filename, utf8string, 'utf16le') the written file
    // is missing the Byte Order Mark (BOM) character (https://en.wikipedia.org/wiki/Byte_order_mark), so many programs won't detect its encoding.
    // So you need to prepend that character (U+FEFF, written '\ufeff') yourself before writing the String/Buffer to a file.
    const utf16buffer = Buffer.from(`\ufeff${utf8string}`, 'utf16le');
    fs.writeFileSync('test-utf16le.txt', utf16buffer);

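    // If you need to write UTF-16BE (Big Endian) files, all you need to do is swap the two bytes of every 16-bit unit in the utf16le buffer.
    // Note that swap16() swaps in place and returns the same buffer, so utf16buffer itself is big-endian after this call.
    // The final file then starts with the bytes fe ff (the big-endian BOM), which tells supporting decoders that the byte order is Big Endian.
    // (od -x on a little-endian machine prints those two bytes as the word fffe, because it groups bytes into host-order 16-bit words.)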
    fs.writeFileSync('test-utf16be.txt', utf16buffer.swap16());


    // You can inspect the content of these files with the od command line tool (mac/linux):
    // od -x test-utf16le.txt       (16-bit words in host byte order)
    // od -t x1 test-utf16le.txt    (single bytes in file order)
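
The BOM and byte-swap steps above can also be checked without touching the disk. The following is a minimal sketch, not part of the original gist: the helper names `encodeUtf16`, `detectBom` and `decodeUtf16` are hypothetical, and it assumes that inspecting the first two bytes is enough to tell the two byte orders apart.

```javascript
// Hypothetical helpers for illustration; none of these names come from the gist.

// Encode a JS string as UTF-16 with a leading BOM (little-endian by default).
function encodeUtf16(text, bigEndian = false) {
  // Buffer.from(..., 'utf16le') always yields little-endian code units.
  const buf = Buffer.from(`\ufeff${text}`, 'utf16le');
  // swap16() swaps each byte pair in place and returns the same buffer;
  // buf is freshly allocated here, so the mutation affects nothing else.
  return bigEndian ? buf.swap16() : buf;
}

// Report the byte order implied by the first two bytes, or null without a BOM.
function detectBom(buf) {
  if (buf.length < 2) return null;
  if (buf[0] === 0xff && buf[1] === 0xfe) return 'utf-16le';
  if (buf[0] === 0xfe && buf[1] === 0xff) return 'utf-16be';
  return null;
}

// Decode a BOM-prefixed UTF-16 buffer back into a string.
function decodeUtf16(buf) {
  const order = detectBom(buf);
  if (order === null) throw new Error('no UTF-16 BOM found');
  // Copy before swapping so the caller's buffer is left untouched.
  const le = order === 'utf-16be' ? Buffer.from(buf).swap16() : buf;
  // Skip the 2-byte BOM when converting back to a string.
  return le.toString('utf16le', 2);
}

console.log(encodeUtf16('a').toString('hex'));       // fffe6100
console.log(encodeUtf16('a', true).toString('hex')); // feff0061
```

Printing the buffer as hex shows the bytes in file order, so the little-endian BOM appears as `fffe` and the big-endian one as `feff`. `decodeUtf16` copies the buffer before swapping because `swap16()` would otherwise modify the caller's data.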