@standy66
Created August 2, 2017 14:27
Revisions

  1. standy66 created this gist Aug 2, 2017.
    13 changes: 13 additions & 0 deletions pymystem3_example.py
    @@ -0,0 +1,13 @@
    import re

    import pymystem3

    mystem = pymystem3.Mystem()

    def stem(s):
        """Return (token, lemma, part of speech, grammemes) for each non-blank token in s."""
        result = []
        for e in mystem.analyze(s):
            text = e['text'].strip()
            if not text:
                continue  # skip whitespace-only tokens
            if e.get('analysis'):
                first = e['analysis'][0]  # only the first analysis is used
                gr = first['gr']
                pos = re.match('^([A-Z]+)', gr).group(0)  # leading uppercase tag
                # every tag after the first; sorted so the output order is stable
                grammemes = ','.join(sorted(set(re.findall(r"[\w']+", gr)[1:])))
                result.append((text, first['lex'], pos, grammemes))
            else:
                result.append((text, '', '', ''))  # no analysis available for this token
        return result
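
The `gr` handling in `stem` can be tried without running mystem. The sketch below is a hypothetical helper, `split_gr`, that is not part of the gist. It applies the same two regular expressions to a hand-written sample entry. The sample is only assumed to be shaped like one element of `mystem.analyze()` output and is not real mystem output. Unlike the original, it returns an empty part of speech instead of raising when the string does not start with an uppercase tag.

```python
import re


def split_gr(gr):
    """Split a mystem-style grammar string into (part of speech, grammemes).

    Hypothetical helper mirroring the regular expressions used in stem().
    """
    pos_match = re.match(r'[A-Z]+', gr)
    pos = pos_match.group(0) if pos_match else ''
    tags = re.findall(r"[\w']+", gr)
    # Everything after the first tag is treated as a grammeme;
    # sorting the set keeps the order stable between runs.
    grammemes = sorted(set(tags[1:]))
    return pos, grammemes


# Hand-written sample (an assumption about the entry shape, not captured output).
sample = {'text': 'мама', 'analysis': [{'lex': 'мама', 'gr': 'S,жен,од=им,ед'}]}

pos, grammemes = split_gr(sample['analysis'][0]['gr'])
print(pos, ','.join(grammemes))
```

Keeping the parsing in a small function of its own makes it possible to test the tag handling separately from the mystem binary that `pymystem3` wraps.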