This gist contains a list of points I found very useful while going through the fast.ai "Practical Deep Learning for Coders" and "Cutting Edge Deep Learning for Coders" MOOCs by Jeremy Howard and team. The list may not be complete, as I watched the videos at 1.5x speed in a marathon session, but I did write down as many things as I could that help get a model working. A fair warning: the points are in no particular order, so you may find Python, CNNs, NLP, and everything else all jumbled together.
Before beginning, I want to thank Jeremy Howard, Rachel Thomas, and the entire fast.ai team for making this awesome, practically oriented MOOC.
1. Progressive image resolution training: train the network on lower-resolution images first, then increase the resolution to get better performance. This can be thought of as transfer learning from the same dataset at a different resolution. NVIDIA used a similar idea in their paper on progressively growing GANs. (Sketch 1 below.)
2. Cyclical learning rates: gradually increasing the learning rate at the start of training helps avoid getting stuck in saddle points and lets the optimizer explore more of the loss landscape. (Sketch 2 below.)
3. To reduce memory usage, you can use lower-precision floating point, i.e. float16 instead of float32. (Sketch 3 below.)
4. Self-supervised learning: the labels are built into the data itself, e.g. a language model is trained to predict the next word, so no separate annotation is needed.
5. For NLP tasks other than language modeling, you can use a language model for transfer learning: first train (or fine-tune) the model as a language model, then add the actual task on top. (Sketch 4 below.)
6. When using transfer learning for NLP, the language-model stage can and should use the entire dataset, i.e. both the train and test texts, since no labels are involved at that stage. (Also covered in Sketch 4.)
7. Discriminative learning rates: use different learning rates for different layer groups in your network. (Sketch 5 below.)
8. Random forests can be used to find optimal hyperparameters. (Sketch 6 below.)
9. Use embeddings for categorical variables. (Sketch 7 below.)
10. For missing values, replace them with the median of the variable and add a new boolean column recording missing=True/False. (Sketch 8 below.)
11. Wherever possible use transfer learning; it almost always improves performance. (Sketch 9 below.)
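Sketch 1 (progressive resizing, point 1): a minimal plain-PyTorch sketch, not fast.ai's exact API. The dataset path, image sizes, and the `train_one_epoch` helper are assumptions for illustration.

```python
import torch
from torchvision import datasets, transforms, models

def make_loader(size, batch_size=64):
    # Rebuild the training loader at a new image size (path is assumed).
    tfms = transforms.Compose([transforms.Resize((size, size)),
                               transforms.ToTensor()])
    ds = datasets.ImageFolder("data/train", transform=tfms)
    return torch.utils.data.DataLoader(ds, batch_size=batch_size, shuffle=True)

model = models.resnet34(weights="IMAGENET1K_V1")

for size in (128, 224):             # low resolution first, then higher
    loader = make_loader(size)
    train_one_epoch(model, loader)  # hypothetical training loop, not shown
```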
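Sketch 2 (cyclical learning rates, point 2): PyTorch ships a one-cycle scheduler based on the same Leslie Smith work that fast.ai's `fit_one_cycle` follows. The model, step count, and peak learning rate here are placeholders.

```python
import torch

optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
# Ramp the LR up for the first ~30% of steps, then anneal it back down.
scheduler = torch.optim.lr_scheduler.OneCycleLR(optimizer, max_lr=0.1,
                                                total_steps=1000)

for step in range(1000):
    loss = compute_loss()    # hypothetical forward pass, not shown
    loss.backward()
    optimizer.step()
    scheduler.step()         # advance the LR schedule once per batch
    optimizer.zero_grad()
```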
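Sketch 3 (half precision, point 3): with PyTorch's automatic mixed precision, the forward pass runs in float16 where it is numerically safe, roughly halving activation memory. `model`, `criterion`, `optimizer`, and `loader` are assumed to already exist.

```python
import torch

scaler = torch.cuda.amp.GradScaler()

for xb, yb in loader:
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():      # float16 forward pass where safe
        loss = criterion(model(xb.cuda()), yb.cuda())
    scaler.scale(loss).backward()        # scale loss to avoid fp16 underflow
    scaler.step(optimizer)               # unscale grads, then step
    scaler.update()
```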
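Sketch 4 (language-model transfer, points 5 and 6): a rough outline of the ULMFiT workflow in fastai v2-style calls; the dataframes and column names are assumptions, and exact arguments may differ between fastai versions. Note the language model is fine-tuned on all texts (train + test), which is legitimate because no labels are used at that stage.

```python
from fastai.text.all import *

# Stage 1: fine-tune a pretrained language model on ALL available text.
dls_lm = TextDataLoaders.from_df(all_texts_df, text_col="text", is_lm=True)
lm = language_model_learner(dls_lm, AWD_LSTM)
lm.fit_one_cycle(1, 1e-2)
lm.save_encoder("ft_enc")

# Stage 2: reuse the fine-tuned encoder for the actual labeled task.
dls_clas = TextDataLoaders.from_df(train_df, text_col="text", label_col="label")
clas = text_classifier_learner(dls_clas, AWD_LSTM)
clas.load_encoder("ft_enc")
clas.fit_one_cycle(1, 1e-2)
```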
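Sketch 5 (discriminative learning rates, point 7): in plain PyTorch this is done with optimizer parameter groups (fastai exposes the same idea via slice-style learning rates). The attribute names below assume a torchvision ResNet; layers not listed simply receive no updates here.

```python
import torch
from torchvision import models

model = models.resnet34(weights="IMAGENET1K_V1")
optimizer = torch.optim.Adam([
    {"params": model.layer3.parameters(), "lr": 1e-5},  # pretrained, small LR
    {"params": model.layer4.parameters(), "lr": 1e-4},  # later layers, larger
    {"params": model.fc.parameters(),     "lr": 1e-3},  # new head, largest LR
])
```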
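Sketch 6 (random forests for hyperparameters, point 8): the gist does not spell out the recipe, but one common reading is to fit a forest on (hyperparameter setting, validation score) pairs from past runs, then use it both as an importance probe and as a cheap surrogate. `trials` and `scores` are assumed to come from earlier experiments.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# trials: (n_runs, n_hyperparams) settings tried so far; scores: their results.
rf = RandomForestRegressor(n_estimators=200).fit(trials, scores)
print(rf.feature_importances_)   # which hyperparameters actually matter

# Use the forest as a surrogate: rank candidate settings without training.
candidates = np.random.rand(1000, trials.shape[1])  # assumes [0, 1] scaling
best = candidates[rf.predict(candidates).argmax()]
```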
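Sketch 7 (categorical embeddings, point 9): each category is mapped to a learned dense vector instead of a one-hot column; the sizes here are arbitrary placeholders.

```python
import torch
import torch.nn as nn

n_categories, emb_dim = 10, 4        # e.g. store IDs mapped to 4-d vectors
emb = nn.Embedding(n_categories, emb_dim)

codes = torch.tensor([0, 3, 3, 9])   # integer-encoded categorical column
vectors = emb(codes)                 # shape (4, 4), trained end to end
```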
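Sketch 8 (missing values, point 10): a direct pandas translation of the point, with a toy column.

```python
import pandas as pd

df = pd.DataFrame({"age": [24.0, None, 31.0, None]})

df["age_missing"] = df["age"].isna()              # keep the missingness signal
df["age"] = df["age"].fillna(df["age"].median())  # fill with the median
```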
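Sketch 9 (transfer learning, point 11): the standard torchvision pattern of freezing a pretrained backbone and training a fresh head; `n_classes` is a placeholder for your task.

```python
import torch.nn as nn
from torchvision import models

model = models.resnet34(weights="IMAGENET1K_V1")   # ImageNet-pretrained
for p in model.parameters():
    p.requires_grad = False                        # freeze the backbone
model.fc = nn.Linear(model.fc.in_features, n_classes)  # new trainable head
```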