@miknoj forked this gist from veekaybee/normcore-llm.md on September 9, 2023, 14:30.

Revisions

  1. @veekaybee revised this gist Aug 23, 2023. 1 changed file with 1 addition and 0 deletions.
    1 change: 1 addition & 0 deletions normcore-llm.md
    @@ -49,6 +49,7 @@ Goals: Add links that are reasonable and good explanations of how stuff works. N

    ## Evaluation

    + [Interpretable Machine Learning](https://arxiv.org/abs/2103.11251)
    + [Evaluating ChatGPT](https://ehudreiter.com/2023/04/04/evaluating-chatgpt/)
    + [ChatGPT: Jack of All Trades, Master of None](https://github.com/CLARIN-PL/chatgpt-evaluation-01-2023)

  2. @veekaybee revised this gist Aug 23, 2023. 1 changed file with 1 addition and 0 deletions.
    1 change: 1 addition & 0 deletions normcore-llm.md
    @@ -37,6 +37,7 @@ Goals: Add links that are reasonable and good explanations of how stuff works. N
    + [What is ChatGPT doing and why does it work](https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/)
    + [How is LlamaCPP Possible?](https://finbarr.ca/how-is-llama-cpp-possible/)
    + [On Prompt Engineering](https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/)
    + [Transformers from Scratch](https://e2eml.school/transformers.html)

    ## Deployment

  3. @veekaybee revised this gist Aug 23, 2023. 1 changed file with 1 addition and 0 deletions.
    1 change: 1 addition & 0 deletions normcore-llm.md
    @@ -55,5 +55,6 @@ Goals: Add links that are reasonable and good explanations of how stuff works. N
    ## UX

    + [Generative Interfaces Beyond Chat (YouTube)](https://www.youtube.com/watch?v=rd-J3hmycQs)
    + [Why Chatbots are not the Future](https://wattenberger.com/thoughts/boo-chatbots)

    Thanks to everyone who added suggestions on [Twitter](https://twitter.com/vboykis/status/1691530859575214081), [Mastodon](https://jawns.club/@vicki/110895263087386568), and Bluesky.
  4. @veekaybee revised this gist Aug 23, 2023. 1 changed file with 1 addition and 0 deletions.
    1 change: 1 addition & 0 deletions normcore-llm.md
    @@ -44,6 +44,7 @@ Goals: Add links that are reasonable and good explanations of how stuff works. N
    + [Challenges and Applications of Large Language Models](https://arxiv.org/abs/2307.10169)
    + [All the Hard Stuff Nobody talks about when building products with LLMs ](https://www.honeycomb.io/blog/hard-stuff-nobody-talks-about-llm)
    + [Scaling Kubernetes to run ChatGPT](https://openai.com/research/scaling-kubernetes-to-7500-nodes)
    + [Numbers every LLM Developer should know](https://github.com/ray-project/llm-numbers)

    ## Evaluation

  5. @veekaybee revised this gist Aug 17, 2023. 1 changed file with 1 addition and 0 deletions.
    1 change: 1 addition & 0 deletions normcore-llm.md
    @@ -27,6 +27,7 @@ Goals: Add links that are reasonable and good explanations of how stuff works. N
    + [Why host your own LLM?](http://marble.onl/posts/why_host_your_own_llm.html)
    + [How to train your own LLMs](https://blog.replit.com/llm-training)
    + Training [Compute-Optimal Large Language Models](https://arxiv.org/abs/2203.15556)
    + [Opt-175B Logbook](https://github.com/facebookresearch/metaseq/blob/main/projects/OPT/chronicles/OPT175B_Logbook.pdf)

    ## Algos

  6. @veekaybee revised this gist Aug 17, 2023. 1 changed file with 2 additions and 0 deletions.
    2 changes: 2 additions & 0 deletions normcore-llm.md
    @@ -53,3 +53,5 @@ Goals: Add links that are reasonable and good explanations of how stuff works. N
    ## UX

    + [Generative Interfaces Beyond Chat (YouTube)](https://www.youtube.com/watch?v=rd-J3hmycQs)

    Thanks to everyone who added suggestions on [Twitter](https://twitter.com/vboykis/status/1691530859575214081), [Mastodon](https://jawns.club/@vicki/110895263087386568), and Bluesky.
  7. @veekaybee revised this gist Aug 17, 2023. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion normcore-llm.md
    @@ -1,6 +1,6 @@
    # Anti-hype LLM reading list

    Goals: Add links that are reasonable and good explanations of how stuff works. No hype and no vendor content if possible.
    Goals: Add links that are reasonable and good explanations of how stuff works. No hype and no vendor content if possible. Practical first-hand accounts and experience preferred (super rare at this point).

    [My own notes from a few months back.](https://gist.github.com/veekaybee/6f8885e9906aa9c5408ebe5c7e870698)

  8. @veekaybee revised this gist Aug 17, 2023. 1 changed file with 1 addition and 0 deletions.
    1 change: 1 addition & 0 deletions normcore-llm.md
    @@ -10,6 +10,7 @@ Goals: Add links that are reasonable and good explanations of how stuff works. N
    + [Self-attention and transformer networks](https://sebastianraschka.com/blog/2021/dl-course.html#l19-self-attention-and-transformer-networks)
    + [What are embeddings](https://vickiboykis.com/what_are_embeddings/)
    + [The Illustrated Word2vec - A Gentle Intro to Word Embeddings in Machine Learning (YouTube)](https://www.youtube.com/watch?v=ISPId9Lhc1g)
    + [Catching up on the weird world of LLMS](https://simonwillison.net/2023/Aug/3/weird-world-of-llms)


    ## Foundational Papers
  9. @veekaybee revised this gist Aug 17, 2023. No changes.
  10. @veekaybee revised this gist Aug 17, 2023. 1 changed file with 1 addition and 0 deletions.
    1 change: 1 addition & 0 deletions normcore-llm.md
    @@ -34,6 +34,7 @@ Goals: Add links that are reasonable and good explanations of how stuff works. N
    + [The State of GPT (YouTube)](https://www.youtube.com/watch?v=bZQun8Y4L2A)
    + [What is ChatGPT doing and why does it work](https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/)
    + [How is LlamaCPP Possible?](https://finbarr.ca/how-is-llama-cpp-possible/)
    + [On Prompt Engineering](https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/)

    ## Deployment

  11. @veekaybee revised this gist Aug 17, 2023. 1 changed file with 1 addition and 0 deletions.
    1 change: 1 addition & 0 deletions normcore-llm.md
    @@ -40,6 +40,7 @@ Goals: Add links that are reasonable and good explanations of how stuff works. N
    + [Building LLM Applications for Production](https://huyenchip.com/2023/04/11/llm-engineering.html)
    + [Challenges and Applications of Large Language Models](https://arxiv.org/abs/2307.10169)
    + [All the Hard Stuff Nobody talks about when building products with LLMs ](https://www.honeycomb.io/blog/hard-stuff-nobody-talks-about-llm)
    + [Scaling Kubernetes to run ChatGPT](https://openai.com/research/scaling-kubernetes-to-7500-nodes)

    ## Evaluation

  12. @veekaybee revised this gist Aug 17, 2023. 1 changed file with 2 additions and 1 deletion.
    3 changes: 2 additions & 1 deletion normcore-llm.md
    @@ -10,7 +10,6 @@ Goals: Add links that are reasonable and good explanations of how stuff works. N
    + [Self-attention and transformer networks](https://sebastianraschka.com/blog/2021/dl-course.html#l19-self-attention-and-transformer-networks)
    + [What are embeddings](https://vickiboykis.com/what_are_embeddings/)
    + [The Illustrated Word2vec - A Gentle Intro to Word Embeddings in Machine Learning (YouTube)](https://www.youtube.com/watch?v=ISPId9Lhc1g)
    +


    ## Foundational Papers
    @@ -22,6 +21,7 @@ Goals: Add links that are reasonable and good explanations of how stuff works. N
    + [Training Language Models to Follow Instructions](https://arxiv.org/abs/2203.02155)
    + [Language Models are Few-Shot Learners](https://arxiv.org/abs/2005.14165)


    ## Training Your Own
    + [Why host your own LLM?](http://marble.onl/posts/why_host_your_own_llm.html)
    + [How to train your own LLMs](https://blog.replit.com/llm-training)
    @@ -33,6 +33,7 @@ Goals: Add links that are reasonable and good explanations of how stuff works. N
    + [Meta Recsys Using and extending Word2Vec](https://engineering.fb.com/2023/08/09/ml-applications/scaling-instagram-explore-recommendations-system)
    + [The State of GPT (YouTube)](https://www.youtube.com/watch?v=bZQun8Y4L2A)
    + [What is ChatGPT doing and why does it work](https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/)
    + [How is LlamaCPP Possible?](https://finbarr.ca/how-is-llama-cpp-possible/)

    ## Deployment

  13. @veekaybee revised this gist Aug 17, 2023. 1 changed file with 1 addition and 0 deletions.
    1 change: 1 addition & 0 deletions normcore-llm.md
    @@ -43,6 +43,7 @@ Goals: Add links that are reasonable and good explanations of how stuff works. N
    ## Evaluation

    + [Evaluating ChatGPT](https://ehudreiter.com/2023/04/04/evaluating-chatgpt/)
    + [ChatGPT: Jack of All Trades, Master of None](https://github.com/CLARIN-PL/chatgpt-evaluation-01-2023)


    ## UX
  14. @veekaybee revised this gist Aug 17, 2023. No changes.
  15. @veekaybee revised this gist Aug 17, 2023. 1 changed file with 4 additions and 0 deletions.
    4 changes: 4 additions & 0 deletions normcore-llm.md
    @@ -2,12 +2,15 @@

    Goals: Add links that are reasonable and good explanations of how stuff works. No hype and no vendor content if possible.

    [My own notes from a few months back.](https://gist.github.com/veekaybee/6f8885e9906aa9c5408ebe5c7e870698)

    ## Background

    + [Survey of LLMS](https://arxiv.org/abs/2303.18223)
    + [Self-attention and transformer networks](https://sebastianraschka.com/blog/2021/dl-course.html#l19-self-attention-and-transformer-networks)
    + [What are embeddings](https://vickiboykis.com/what_are_embeddings/)
    + [The Illustrated Word2vec - A Gentle Intro to Word Embeddings in Machine Learning (YouTube)](https://www.youtube.com/watch?v=ISPId9Lhc1g)
    +


    ## Foundational Papers
    @@ -39,6 +42,7 @@ Goals: Add links that are reasonable and good explanations of how stuff works. N

    ## Evaluation

    + [Evaluating ChatGPT](https://ehudreiter.com/2023/04/04/evaluating-chatgpt/)


    ## UX
  16. @veekaybee revised this gist Aug 17, 2023. No changes.
  17. @veekaybee revised this gist Aug 17, 2023. 1 changed file with 15 additions and 1 deletion.
    16 changes: 15 additions & 1 deletion normcore-llm.md
    @@ -5,13 +5,27 @@ Goals: Add links that are reasonable and good explanations of how stuff works. N
    ## Background

    + [Survey of LLMS](https://arxiv.org/abs/2303.18223)
    + [Self-attention and transformer networks](https://sebastianraschka.com/blog/2021/dl-course.html#l19-self-attention-and-transformer-networks)
    + [What are embeddings](https://vickiboykis.com/what_are_embeddings/)
    + [The Illustrated Word2vec - A Gentle Intro to Word Embeddings in Machine Learning (YouTube)](https://www.youtube.com/watch?v=ISPId9Lhc1g)


    ## Foundational Papers

    + [Attention is all you Need](https://arxiv.org/abs/1706.03762)
    + [Scaling Laws for Neural Language Models](https://arxiv.org/abs/2001.08361)
    + [BERT](https://arxiv.org/abs/1810.04805)
    + [Language Models are Unsupervised Multi-Task Learners](https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf)
    + [Training Language Models to Follow Instructions](https://arxiv.org/abs/2203.02155)
    + [Language Models are Few-Shot Learners](https://arxiv.org/abs/2005.14165)

    ## Training Your Own
    + [Why host your own LLM?](http://marble.onl/posts/why_host_your_own_llm.html)
    + [How to train your own LLMs](https://blog.replit.com/llm-training)
    + Training [Compute-Optimal Large Language Models](https://arxiv.org/abs/2203.15556)

    ## Algos
    + [What are embeddings](https://vickiboykis.com/what_are_embeddings/)

    + [The case for GZIP Classifiers](https://nlpnewsletter.substack.com/p/flashier-attention-gzip-classifiers) and [more on nearest neighbors algos](https://magazine.sebastianraschka.com/p/large-language-models-and-nearest)
    + [Meta Recsys Using and extending Word2Vec](https://engineering.fb.com/2023/08/09/ml-applications/scaling-instagram-explore-recommendations-system)
    + [The State of GPT (YouTube)](https://www.youtube.com/watch?v=bZQun8Y4L2A)
  18. @veekaybee revised this gist Aug 17, 2023. 1 changed file with 2 additions and 0 deletions.
    2 changes: 2 additions & 0 deletions normcore-llm.md
    @@ -1,5 +1,7 @@
    # Anti-hype LLM reading list

    Goals: Add links that are reasonable and good explanations of how stuff works. No hype and no vendor content if possible.

    ## Background

    + [Survey of LLMS](https://arxiv.org/abs/2303.18223)
  19. @veekaybee revised this gist Aug 17, 2023. 1 changed file with 1 addition and 0 deletions.
    1 change: 1 addition & 0 deletions normcore-llm.md
    @@ -13,6 +13,7 @@
    + [The case for GZIP Classifiers](https://nlpnewsletter.substack.com/p/flashier-attention-gzip-classifiers) and [more on nearest neighbors algos](https://magazine.sebastianraschka.com/p/large-language-models-and-nearest)
    + [Meta Recsys Using and extending Word2Vec](https://engineering.fb.com/2023/08/09/ml-applications/scaling-instagram-explore-recommendations-system)
    + [The State of GPT (YouTube)](https://www.youtube.com/watch?v=bZQun8Y4L2A)
    + [What is ChatGPT doing and why does it work](https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/)

    ## Deployment

  20. @veekaybee revised this gist Aug 17, 2023. 1 changed file with 2 additions and 2 deletions.
    4 changes: 2 additions & 2 deletions normcore-llm.md
    @@ -12,13 +12,13 @@
    + [What are embeddings](https://vickiboykis.com/what_are_embeddings/)
    + [The case for GZIP Classifiers](https://nlpnewsletter.substack.com/p/flashier-attention-gzip-classifiers) and [more on nearest neighbors algos](https://magazine.sebastianraschka.com/p/large-language-models-and-nearest)
    + [Meta Recsys Using and extending Word2Vec](https://engineering.fb.com/2023/08/09/ml-applications/scaling-instagram-explore-recommendations-system)
    + [The State of GPT](https://www.youtube.com/watch?v=bZQun8Y4L2A)
    + [The State of GPT (YouTube)](https://www.youtube.com/watch?v=bZQun8Y4L2A)

    ## Deployment

    + [Building LLM Applications for Production](https://huyenchip.com/2023/04/11/llm-engineering.html)
    + [Challenges and Applications of Large Language Models](https://arxiv.org/abs/2307.10169)
    + All the Hard Stuff Nobody talks about when [building products with LLMs (YouTube)](https://www.honeycomb.io/blog/hard-stuff-nobody-talks-about-llm)
    + [All the Hard Stuff Nobody talks about when building products with LLMs ](https://www.honeycomb.io/blog/hard-stuff-nobody-talks-about-llm)

    ## Evaluation

  21. @veekaybee revised this gist Aug 17, 2023. 1 changed file with 7 additions and 0 deletions.
    7 changes: 7 additions & 0 deletions normcore-llm.md
    @@ -9,13 +9,20 @@
    + [How to train your own LLMs](https://blog.replit.com/llm-training)

    ## Algos
    + [What are embeddings](https://vickiboykis.com/what_are_embeddings/)
    + [The case for GZIP Classifiers](https://nlpnewsletter.substack.com/p/flashier-attention-gzip-classifiers) and [more on nearest neighbors algos](https://magazine.sebastianraschka.com/p/large-language-models-and-nearest)
    + [Meta Recsys Using and extending Word2Vec](https://engineering.fb.com/2023/08/09/ml-applications/scaling-instagram-explore-recommendations-system)
    + [The State of GPT](https://www.youtube.com/watch?v=bZQun8Y4L2A)

    ## Deployment

    + [Building LLM Applications for Production](https://huyenchip.com/2023/04/11/llm-engineering.html)
    + [Challenges and Applications of Large Language Models](https://arxiv.org/abs/2307.10169)
    + All the Hard Stuff Nobody talks about when [building products with LLMs (YouTube)](https://www.honeycomb.io/blog/hard-stuff-nobody-talks-about-llm)

    ## Evaluation



    ## UX

  22. @veekaybee revised this gist Aug 15, 2023. 1 changed file with 4 additions and 0 deletions.
    4 changes: 4 additions & 0 deletions normcore-llm.md
    @@ -1,5 +1,9 @@
    # Anti-hype LLM reading list

    ## Background

    + [Survey of LLMS](https://arxiv.org/abs/2303.18223)

    ## Training Your Own
    + [Why host your own LLM?](http://marble.onl/posts/why_host_your_own_llm.html)
    + [How to train your own LLMs](https://blog.replit.com/llm-training)
  23. @veekaybee created this gist Aug 15, 2023.
    18 changes: 18 additions & 0 deletions normcore-llm.md
    @@ -0,0 +1,18 @@
    # Anti-hype LLM reading list

    ## Training Your Own
    + [Why host your own LLM?](http://marble.onl/posts/why_host_your_own_llm.html)
    + [How to train your own LLMs](https://blog.replit.com/llm-training)

    ## Algos
    + [The case for GZIP Classifiers](https://nlpnewsletter.substack.com/p/flashier-attention-gzip-classifiers) and [more on nearest neighbors algos](https://magazine.sebastianraschka.com/p/large-language-models-and-nearest)
    + [Meta Recsys Using and extending Word2Vec](https://engineering.fb.com/2023/08/09/ml-applications/scaling-instagram-explore-recommendations-system)

    ## Deployment

    + [Building LLM Applications for Production](https://huyenchip.com/2023/04/11/llm-engineering.html)
    + [Challenges and Applications of Large Language Models](https://arxiv.org/abs/2307.10169)

    ## UX

    + [Generative Interfaces Beyond Chat (YouTube)](https://www.youtube.com/watch?v=rd-J3hmycQs)