@amansrivastava17
Created November 12, 2019 07:15
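import json
import re

# Replace punctuation with spaces (`sentences` is assumed to be a list of strings defined earlier).
sentences = [re.sub(r'[.,:?{}]', ' ', sentence) for sentence in sentences]
corpus = " ".join(sentences)
# Collect the unique words; sorting keeps the indices reproducible across runs.
words = sorted(set(corpus.split()))
word_index = {word: index for index, word in enumerate(words)}
# Save the word -> index mapping as JSON.
with open('word_index.json', 'w') as file:
    json.dump(word_index, file)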
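A minimal sketch of how the saved mapping might be used afterwards, assuming the goal is to turn new sentences into lists of word indices. The helper names `load_word_index` and `encode`, and the `unknown=-1` placeholder for unseen words, are hypothetical and not part of the original snippet.

```python
import json
import re


def load_word_index(path="word_index.json"):
    """Read the word -> index mapping previously written with json.dump."""
    with open(path) as file:
        return json.load(file)


def encode(sentence, word_index, unknown=-1):
    """Map each word of a sentence to its index; unseen words get `unknown`."""
    # Apply the same punctuation cleanup used when the vocabulary was built.
    cleaned = re.sub(r"[.,:?{}]", " ", sentence)
    return [word_index.get(word, unknown) for word in cleaned.split()]
```

For example, `encode("hello, world?", {"hello": 0, "world": 1})` returns `[0, 1]`. Words that were not in the original corpus map to the placeholder value, so a real pipeline would probably reserve a dedicated index for them instead.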