I put a commented version at the bottom that explains what this does.
- Install Homebrew: http://brew.sh/
- In Terminal: brew install poppler hunspell
- mkdir ~/Dictionaries
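- Download dictionaries from https://cgit.freedesktop.org/libreoffice/dictionaries/tree/en and put them in ~/Dictionaries. Use the "plain" link to get the raw file. You need both the .dic and the .aff file for each dictionary (the command below uses en_US, so en_US.dic and en_US.aff).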
- pdftotext pattern.pdf - | tr -s '[:blank:][:punct:]' '\n' | awk 'length($1) >= 2 && length($1) <= 5 { print $1 }' | hunspell -d ~/Dictionaries/en_US -a | awk '{print $1,$2}' | grep -v '^[\*@]' | tr -d '&#' | sort | uniq -c
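
The one-liner above can be read as two filters wrapped around the hunspell call. Here is a rough sketch of that split, reusing the exact commands from the pipeline. The function names short_words and count_misses are made up for illustration. The sketch assumes hunspell's -a (ispell-compatible pipe) mode, where a correct word is reported as a line starting with "*", a misspelling with suggestions as a line starting with "&", a misspelling without suggestions as a line starting with "#", and the first line is a version banner starting with "@".

```shell
# Split text into one word per line and keep only words of 2 to 5 characters.
short_words() {
  tr -s '[:blank:][:punct:]' '\n' |
    awk 'length($1) >= 2 && length($1) <= 5 { print $1 }'
}

# Take hunspell -a output, drop the "@" banner and the "*" (correct) lines,
# strip the "&" / "#" markers, and count how often each flagged word appears.
count_misses() {
  awk '{print $1,$2}' | grep -v '^[\*@]' | tr -d '&#' | sort | uniq -c
}

# Full pipeline, equivalent to the one-liner (needs poppler and hunspell):
# pdftotext pattern.pdf - | short_words | hunspell -d ~/Dictionaries/en_US -a | count_misses
```

Splitting it this way makes it easier to try each half on its own, for example by piping a sentence through short_words to see which words would be sent to the spell checker.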