Forked from shagunsodhani/Bag of Tricks for Efficient Text Classification.md
# Bag of Tricks for Efficient Text Classification

## Introduction

* Introduces fastText, a simple and highly efficient approach for text classification.
* On par with deep learning models in terms of accuracy, though an order of magnitude faster to train and evaluate.
* [Link to the paper](http://arxiv.org/abs/1607.01759v3)
* [Link to code](https://github.com/facebookresearch/fastText)

## Architecture

* Built on top of linear models with a rank constraint and a fast loss approximation.
* Word representations are averaged into a text representation, which is fed to a linear classifier (a minimal sketch appears at the end of these notes).
* The text representation is a hidden state that can be shared among features and classes.
* A softmax layer produces a probability distribution over the pre-defined classes.
* Computational complexity is *O(kh)*, where *k* is the number of classes and *h* is the dimension of the text representation; this becomes expensive when *k* is large.

### Hierarchical Softmax

* Based on the Huffman coding tree.
* Reduces complexity to *O(h log(k))*.
* The top-T results from the tree can be computed efficiently in *O(log(T))* using a binary heap.

### N-gram Features

* Instead of modeling word order explicitly, uses a bag of n-grams to maintain efficiency without losing much accuracy.
* Uses the [hashing trick](https://arxiv.org/pdf/0902.2206.pdf) to maintain a fast and memory-efficient mapping of the n-grams (see the hashing sketch at the end of these notes).

## Experiments

### Sentiment Analysis

* fastText benefits from using bigrams.
* Outperforms [char-CNN](http://arxiv.org/abs/1502.01710v5) and [char-CRNN](http://arxiv.org/abs/1602.00367v1), and performs slightly worse than [VDCNN](http://arxiv.org/abs/1606.01781v1).
* Orders of magnitude faster in terms of training time.
* Note: fastText does not use pre-trained word embeddings.

### Tag Prediction

* fastText with bigrams outperforms [Tagspace](http://emnlp2014.org/papers/pdf/EMNLP2014194.pdf).
* fastText is up to 600 times faster at test time.
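The sketch below illustrates the architecture described above: word vectors are averaged into a text representation, pushed through a linear classifier, and normalized with a softmax. It is a minimal NumPy illustration under assumed (made-up) dimensions and random initialization, not the fastText implementation.

```python
# Minimal sketch of a fastText-style classifier (dimensions are illustrative assumptions).
import numpy as np

vocab_size, hidden_dim, num_classes = 10_000, 10, 4   # hypothetical sizes

rng = np.random.default_rng(0)
embeddings = rng.normal(scale=0.1, size=(vocab_size, hidden_dim))  # word vectors
W = rng.normal(scale=0.1, size=(hidden_dim, num_classes))          # linear classifier

def predict_proba(token_ids):
    """Average word vectors into a text representation, then apply softmax(W^T x)."""
    text_repr = embeddings[token_ids].mean(axis=0)   # shared hidden state of dimension h
    logits = text_repr @ W                           # O(k * h) for a full softmax
    exp = np.exp(logits - logits.max())              # numerically stable softmax
    return exp / exp.sum()

print(predict_proba([3, 17, 256]))  # probability distribution over the 4 classes
```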
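The next sketch shows one way the hashing trick for n-gram features can work: each bigram is hashed into a fixed number of buckets instead of being stored in an explicit vocabulary, so lookup stays fast and memory stays bounded. The bucket count and the CRC32 hash function are illustrative choices, not necessarily the ones fastText uses.

```python
# Sketch of the hashing trick for bigram features (hash function and bucket count assumed).
import zlib

NUM_BUCKETS = 2_000_000  # hypothetical number of hash buckets

def bigram_bucket(word_a, word_b, num_buckets=NUM_BUCKETS):
    """Map a bigram to a bucket id; occasional collisions are accepted for speed and memory."""
    key = f"{word_a} {word_b}".encode("utf-8")
    return zlib.crc32(key) % num_buckets

tokens = "the movie was surprisingly good".split()
bigram_ids = [bigram_bucket(a, b) for a, b in zip(tokens, tokens[1:])]
print(bigram_ids)  # bucket ids that would be looked up alongside the word ids
```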
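As a usage example, the official fastText Python bindings expose the same ideas (bigram features, hierarchical softmax) through `train_supervised`; the file name and hyperparameter values below are placeholders rather than recommendations.

```python
# Hedged usage sketch with the fastText Python bindings (pip install fasttext).
# Each line of the (hypothetical) train.txt holds a "__label__..." prefix followed by the text.
import fasttext

model = fasttext.train_supervised(
    input="train.txt",   # hypothetical training file
    wordNgrams=2,        # bigram features, as used in the sentiment experiments
    loss="hs",           # hierarchical softmax, useful when the label set is large
)
print(model.predict("this movie was surprisingly good"))
```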