Goals: add links that are reasonable, well-written explanations of how stuff works. No hype, and no vendor content if possible. Practical first-hand accounts of running models in prod are eagerly sought.
- The Illustrated Word2vec - A Gentle Intro to Word Embeddings in Machine Learning (YouTube)
- Transformers as Support Vector Machines
- Survey of LLMs
- Deep Learning Systems
- Fundamental ML Reading List
- What are embeddings
- Concepts from Operating Systems that Found their way into LLMs
- Talking about Large Language Models
- Language Modeling is Compression
- Vector Search - Long-Term Memory in AI
- Eight things to know about large language models
- BERT
- Seq2Seq
- Attention Is All You Need
- Scaling Laws for Neural Language Models
- Language Models are Unsupervised Multitask Learners
- Training Language Models to Follow Instructions
- Language Models are Few-Shot Learners
- Transformers from Scratch
- Transformer Math
- Five Years of GPT Progress
- Lost in the Middle: How Language Models Use Long Contexts
- Self-attention and transformer networks
- Attention
- Understanding and Coding the Attention Mechanism
- Attention Mechanisms
- Keys, Queries, and Values - see the toy attention sketch after this list
- What is ChatGPT doing and why does it work
- My own notes from a few months back.
- Karpathy's The State of GPT (YouTube)
- How open are open architectures?
- Catching up on the weird world of LLMs
- Building an LLM from Scratch
- Large Language Models in 2023 (talk and slides)
- Why host your own LLM?
- How to train your own LLMs
- Hugging Face Resources on Training Your Own
- Training Compute-Optimal Large Language Models
- OPT-175B Logbook
- The Complete Guide to LLM Fine-tuning
- LLaMAntino: LLaMA 2 Models for Effective Text Generation in Italian Language - Really great overview of SOTA fine-tuning techniques
- A Gentle Introduction to 8-bit matrix multiplication
- Motivation for Parameter-Efficient Fine-tuning
- Which Quantization Method is Right for You?
- Fine-tuning with LoRA and QLoRA
- Fine-tuning RedPajama on Slack Data
- How is llama.cpp Possible?
- How to beat GPT-4 with a 13B Model
- Efficient LLM Inference on CPUs
- Tiny Language Models Come of Age
- Efficiency LLM Spectrum
- TinyML at MIT
- Building LLM Applications for Production
- Challenges and Applications of Large Language Models
- All the Hard Stuff Nobody talks about when building products with LLMs
- Scaling Kubernetes to run ChatGPT
- Numbers every LLM Developer should know - see the back-of-the-envelope memory sketch after this list
- Against LLM Maximalism
- A Guide to Inference and Performance
- (InThe)WildChat: 570K ChatGPT Interaction Logs In The Wild
- LLM Inference Performance Engineering: Best Practices
- The State of Production LLMs in 2023
- Machine Learning Engineering - an open book on successful training of large language models and multi-modal models
- The Best GPUs for Deep Learning 2023
- Making Deep Learning Go Brrrr From First Principles
- Everything about Distributed Training and Efficient Finetuning
- Training LLMs at Scale with AMD MI250 GPUs
- GPU Programming
- Evaluating ChatGPT
- ChatGPT: Jack of All Trades, Master of None
- What's Going on with the Open LLM Leaderboard
- Challenges in Evaluating AI Systems
- LLM Evaluation Papers
- Evaluating LLMs is a Minefield
- Generative Interfaces Beyond Chat (YouTube)
- Why Chatbots are not the Future
- The Future of Search is Boutique
- As a Large Language Model, I
- Natural Language is an Unnatural Interface
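
Since several of the links above (Self-attention and transformer networks; Keys, Queries, and Values) cover the same core mechanism, here is a minimal sketch of scaled dot-product attention in NumPy. The shapes, random inputs, and function names are illustrative assumptions, not code from any of the linked posts.

```python
# Toy scaled dot-product attention, NumPy only.
# Shapes and random inputs are illustrative; real implementations add
# masking, multiple heads, and learned projection weights.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Q: (seq_q, d_k), K: (seq_k, d_k), V: (seq_k, d_v)
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # similarity of each query to each key
    weights = softmax(scores, axis=-1)   # each row is a distribution over keys
    return weights @ V                   # weighted mix of values per query

rng = np.random.default_rng(0)
seq, d_k, d_v = 4, 8, 8
Q = rng.normal(size=(seq, d_k))
K = rng.normal(size=(seq, d_k))
V = rng.normal(size=(seq, d_v))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (4, 8)
```

Multi-head attention runs several of these in parallel over learned linear projections of the same inputs and concatenates the results; the linked posts cover the details.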
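
And in the spirit of Transformer Math and Numbers every LLM Developer should know, a back-of-the-envelope memory sketch: weight memory is roughly parameter count times bytes per parameter, and the KV cache grows linearly with sequence length. The dimensions below are assumptions, loosely shaped like a 7B Llama-style model; substitute your own.

```python
# Back-of-the-envelope inference memory estimates.
# Model dimensions below are assumptions (roughly 7B Llama-shaped);
# real memory use also includes activations and framework overhead.

def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
    return n_params * bytes_per_param / 1e9

def kv_cache_gb(n_layers: int, n_kv_heads: int, head_dim: int,
                seq_len: int, batch: int, bytes_per_value: float) -> float:
    # 2x for keys and values, cached at every layer for every token
    return (2 * n_layers * n_kv_heads * head_dim
            * seq_len * batch * bytes_per_value / 1e9)

n_params = 7e9
for name, bytes_per_param in [("fp16", 2), ("int8", 1), ("int4", 0.5)]:
    print(f"{name} weights: ~{weight_memory_gb(n_params, bytes_per_param):.1f} GB")

# Assumed dimensions: 32 layers, 32 KV heads, head_dim 128, fp16 cache
print(f"KV cache, 4k ctx, batch 1: ~{kv_cache_gb(32, 32, 128, 4096, 1, 2):.2f} GB")
```

Those two numbers are most of the story behind why quantization and shorter contexts are the first levers people reach for when fitting a model on a single GPU.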
Thanks to everyone who added suggestions on Twitter, Mastodon, and Bluesky.