@Ima8
Forked from dtinth/cut.js
Created July 18, 2019 09:51
Thai word cut in Chrome
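// Note: This uses a non-standard V8 feature (Intl.v8BreakIterator),
// so it only works in V8-based environments such as Chrome.
// https://code.google.com/archive/p/v8-i18n/wikis/BreakIterator.wiki
//
// The standard replacement is Intl.Segmenter, which no browser had
// implemented when this was written.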
//
function cut(text) {
  // Break iterator for the Thai locale (V8-only API).
  const iterator = new Intl.v8BreakIterator(["th"]);
  iterator.adoptText(text);
  const result = [];
  let pos = iterator.first();
  // next() returns -1 once the iterator has moved past the end of the text.
  while (pos !== -1) {
    const nextPos = iterator.next();
    if (nextPos === -1) break;
    result.push(text.slice(pos, nextPos));
    pos = nextPos;
  }
  return result;
}

it('cuts word', () => {
  expect(cut('ตัดคำภาษาไทย')).toEqual(["ตัด", "คำ", "ภาษา", "ไทย"]);
});
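
The header comment names Intl.Segmenter as the standard counterpart to the V8-only iterator. As a rough sketch, assuming a runtime that implements Intl.Segmenter (current browsers and recent Node.js releases do), the same splitting could look like the following. The name `cutWithSegmenter` and the `locale` parameter are hypothetical and not part of the original gist.

```javascript
// Sketch only: cutWithSegmenter is a hypothetical helper, not the gist author's code.
// Assumes the runtime provides the standard Intl.Segmenter API.
function cutWithSegmenter(text, locale = "th") {
  const segmenter = new Intl.Segmenter(locale, { granularity: "word" });
  // segment() returns an iterable of { segment, index, isWordLike } records;
  // keep only the substring of each record, in order.
  return Array.from(segmenter.segment(text), (part) => part.segment);
}
```

Like `cut`, this keeps every segment, including whitespace and punctuation, so joining the result reproduces the input. Filtering on each record's `isWordLike` flag would keep only the word-like segments.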