@miknoj forked this gist from veekaybee/normcore-llm.md on September 9, 2023, 14:30.

Revisions

  1. @veekaybee revised this gist Aug 23, 2023. 1 changed file with 1 addition and 0 deletions.
    1 change: 1 addition & 0 deletions normcore-llm.md
    @@ -49,6 +49,7 @@ Goals: Add links that are reasonable and good explanations of how stuff works. N

    ## Evaluation

    + [Interpretable Machine Learning](https://arxiv.org/abs/2103.11251)
    + [Evaluating ChatGPT](https://ehudreiter.com/2023/04/04/evaluating-chatgpt/)
    + [ChatGPT: Jack of All Trades, Master of None](https://github.com/CLARIN-PL/chatgpt-evaluation-01-2023)

  2. @veekaybee revised this gist Aug 23, 2023. 1 changed file with 1 addition and 0 deletions.
    1 change: 1 addition & 0 deletions normcore-llm.md
    @@ -37,6 +37,7 @@ Goals: Add links that are reasonable and good explanations of how stuff works. N
    + [What is ChatGPT doing and why does it work](https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/)
    + [How is LlamaCPP Possible?](https://finbarr.ca/how-is-llama-cpp-possible/)
    + [On Prompt Engineering](https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/)
    + [Transformers from Scratch](https://e2eml.school/transformers.html)

    ## Deployment

  3. @veekaybee revised this gist Aug 23, 2023. 1 changed file with 1 addition and 0 deletions.
    1 change: 1 addition & 0 deletions normcore-llm.md
    @@ -55,5 +55,6 @@ Goals: Add links that are reasonable and good explanations of how stuff works. N
    ## UX

    + [Generative Interfaces Beyond Chat (YouTube)](https://www.youtube.com/watch?v=rd-J3hmycQs)
    + [Why Chatbots are not the Future](https://wattenberger.com/thoughts/boo-chatbots)

    Thanks to everyone who added suggestions on [Twitter](https://twitter.com/vboykis/status/1691530859575214081), [Mastodon](https://jawns.club/@vicki/110895263087386568), and Bluesky.
  4. @veekaybee revised this gist Aug 23, 2023. 1 changed file with 1 addition and 0 deletions.
    1 change: 1 addition & 0 deletions normcore-llm.md
    @@ -44,6 +44,7 @@ Goals: Add links that are reasonable and good explanations of how stuff works. N
    + [Challenges and Applications of Large Language Models](https://arxiv.org/abs/2307.10169)
    + [All the Hard Stuff Nobody talks about when building products with LLMs ](https://www.honeycomb.io/blog/hard-stuff-nobody-talks-about-llm)
    + [Scaling Kubernetes to run ChatGPT](https://openai.com/research/scaling-kubernetes-to-7500-nodes)
    + [Numbers every LLM Developer should know](https://github.com/ray-project/llm-numbers)

    ## Evaluation

  5. @veekaybee revised this gist Aug 17, 2023. 1 changed file with 1 addition and 0 deletions.
    1 change: 1 addition & 0 deletions normcore-llm.md
    @@ -27,6 +27,7 @@ Goals: Add links that are reasonable and good explanations of how stuff works. N
    + [Why host your own LLM?](http://marble.onl/posts/why_host_your_own_llm.html)
    + [How to train your own LLMs](https://blog.replit.com/llm-training)
    + Training [Compute-Optimal Large Language Models](https://arxiv.org/abs/2203.15556)
    + [Opt-175B Logbook](https://github.com/facebookresearch/metaseq/blob/main/projects/OPT/chronicles/OPT175B_Logbook.pdf)

    ## Algos

  6. @veekaybee revised this gist Aug 17, 2023. 1 changed file with 2 additions and 0 deletions.
    2 changes: 2 additions & 0 deletions normcore-llm.md
    @@ -53,3 +53,5 @@ Goals: Add links that are reasonable and good explanations of how stuff works. N
    ## UX

    + [Generative Interfaces Beyond Chat (YouTube)](https://www.youtube.com/watch?v=rd-J3hmycQs)

    Thanks to everyone who added suggestions on [Twitter](https://twitter.com/vboykis/status/1691530859575214081), [Mastodon](https://jawns.club/@vicki/110895263087386568), and Bluesky.
  7. @veekaybee revised this gist Aug 17, 2023. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion normcore-llm.md
    @@ -1,6 +1,6 @@
    # Anti-hype LLM reading list

    Goals: Add links that are reasonable and good explanations of how stuff works. No hype and no vendor content if possible.
    Goals: Add links that are reasonable and good explanations of how stuff works. No hype and no vendor content if possible. Practical first-hand accounts and experience preferred (super rare at this point).

    [My own notes from a few months back.](https://gist.github.com/veekaybee/6f8885e9906aa9c5408ebe5c7e870698)

  8. @veekaybee revised this gist Aug 17, 2023. 1 changed file with 1 addition and 0 deletions.
    1 change: 1 addition & 0 deletions normcore-llm.md
    @@ -10,6 +10,7 @@ Goals: Add links that are reasonable and good explanations of how stuff works. N
    + [Self-attention and transformer networks](https://sebastianraschka.com/blog/2021/dl-course.html#l19-self-attention-and-transformer-networks)
    + [What are embeddings](https://vickiboykis.com/what_are_embeddings/)
    + [The Illustrated Word2vec - A Gentle Intro to Word Embeddings in Machine Learning (YouTube)](https://www.youtube.com/watch?v=ISPId9Lhc1g)
    + [Catching up on the weird world of LLMS](https://simonwillison.net/2023/Aug/3/weird-world-of-llms)


    ## Foundational Papers
  9. @veekaybee revised this gist Aug 17, 2023. No changes.
  10. @veekaybee revised this gist Aug 17, 2023. 1 changed file with 1 addition and 0 deletions.
    1 change: 1 addition & 0 deletions normcore-llm.md
    @@ -34,6 +34,7 @@ Goals: Add links that are reasonable and good explanations of how stuff works. N
    + [The State of GPT (YouTube)](https://www.youtube.com/watch?v=bZQun8Y4L2A)
    + [What is ChatGPT doing and why does it work](https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/)
    + [How is LlamaCPP Possible?](https://finbarr.ca/how-is-llama-cpp-possible/)
    + [On Prompt Engineering](https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/)

    ## Deployment

  11. @veekaybee revised this gist Aug 17, 2023. 1 changed file with 1 addition and 0 deletions.
    1 change: 1 addition & 0 deletions normcore-llm.md
    @@ -40,6 +40,7 @@ Goals: Add links that are reasonable and good explanations of how stuff works. N
    + [Building LLM Applications for Production](https://huyenchip.com/2023/04/11/llm-engineering.html)
    + [Challenges and Applications of Large Language Models](https://arxiv.org/abs/2307.10169)
    + [All the Hard Stuff Nobody talks about when building products with LLMs ](https://www.honeycomb.io/blog/hard-stuff-nobody-talks-about-llm)
    + [Scaling Kubernetes to run ChatGPT](https://openai.com/research/scaling-kubernetes-to-7500-nodes)

    ## Evaluation

  12. @veekaybee revised this gist Aug 17, 2023. 1 changed file with 2 additions and 1 deletion.
    3 changes: 2 additions & 1 deletion normcore-llm.md
    @@ -10,7 +10,6 @@ Goals: Add links that are reasonable and good explanations of how stuff works. N
    + [Self-attention and transformer networks](https://sebastianraschka.com/blog/2021/dl-course.html#l19-self-attention-and-transformer-networks)
    + [What are embeddings](https://vickiboykis.com/what_are_embeddings/)
    + [The Illustrated Word2vec - A Gentle Intro to Word Embeddings in Machine Learning (YouTube)](https://www.youtube.com/watch?v=ISPId9Lhc1g)
    +


    ## Foundational Papers
    @@ -22,6 +21,7 @@ Goals: Add links that are reasonable and good explanations of how stuff works. N
    + [Training Language Models to Follow Instructions](https://arxiv.org/abs/2203.02155)
    + [Language Models are Few-Shot Learners](https://arxiv.org/abs/2005.14165)


    ## Training Your Own
    + [Why host your own LLM?](http://marble.onl/posts/why_host_your_own_llm.html)
    + [How to train your own LLMs](https://blog.replit.com/llm-training)
    @@ -33,6 +33,7 @@ Goals: Add links that are reasonable and good explanations of how stuff works. N
    + [Meta Recsys Using and extending Word2Vec](https://engineering.fb.com/2023/08/09/ml-applications/scaling-instagram-explore-recommendations-system)
    + [The State of GPT (YouTube)](https://www.youtube.com/watch?v=bZQun8Y4L2A)
    + [What is ChatGPT doing and why does it work](https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/)
    + [How is LlamaCPP Possible?](https://finbarr.ca/how-is-llama-cpp-possible/)

    ## Deployment

  13. @veekaybee revised this gist Aug 17, 2023. 1 changed file with 1 addition and 0 deletions.
    1 change: 1 addition & 0 deletions normcore-llm.md
    @@ -43,6 +43,7 @@ Goals: Add links that are reasonable and good explanations of how stuff works. N
    ## Evaluation

    + [Evaluating ChatGPT](https://ehudreiter.com/2023/04/04/evaluating-chatgpt/)
    + [ChatGPT: Jack of All Trades, Master of None](https://github.com/CLARIN-PL/chatgpt-evaluation-01-2023)


    ## UX
  14. @veekaybee revised this gist Aug 17, 2023. No changes.
  15. @veekaybee revised this gist Aug 17, 2023. 1 changed file with 4 additions and 0 deletions.
    4 changes: 4 additions & 0 deletions normcore-llm.md
    @@ -2,12 +2,15 @@

    Goals: Add links that are reasonable and good explanations of how stuff works. No hype and no vendor content if possible.

    [My own notes from a few months back.](https://gist.github.com/veekaybee/6f8885e9906aa9c5408ebe5c7e870698)

    ## Background

    + [Survey of LLMS](https://arxiv.org/abs/2303.18223)
    + [Self-attention and transformer networks](https://sebastianraschka.com/blog/2021/dl-course.html#l19-self-attention-and-transformer-networks)
    + [What are embeddings](https://vickiboykis.com/what_are_embeddings/)
    + [The Illustrated Word2vec - A Gentle Intro to Word Embeddings in Machine Learning (YouTube)](https://www.youtube.com/watch?v=ISPId9Lhc1g)
    +


    ## Foundational Papers
    @@ -39,6 +42,7 @@ Goals: Add links that are reasonable and good explanations of how stuff works. N

    ## Evaluation

    + [Evaluating ChatGPT](https://ehudreiter.com/2023/04/04/evaluating-chatgpt/)


    ## UX
  16. @veekaybee revised this gist Aug 17, 2023. No changes.
  17. @veekaybee revised this gist Aug 17, 2023. 1 changed file with 15 additions and 1 deletion.
    16 changes: 15 additions & 1 deletion normcore-llm.md
    @@ -5,13 +5,27 @@ Goals: Add links that are reasonable and good explanations of how stuff works. N
    ## Background

    + [Survey of LLMS](https://arxiv.org/abs/2303.18223)
    + [Self-attention and transformer networks](https://sebastianraschka.com/blog/2021/dl-course.html#l19-self-attention-and-transformer-networks)
    + [What are embeddings](https://vickiboykis.com/what_are_embeddings/)
    + [The Illustrated Word2vec - A Gentle Intro to Word Embeddings in Machine Learning (YouTube)](https://www.youtube.com/watch?v=ISPId9Lhc1g)


    ## Foundational Papers

    + [Attention is all you Need](https://arxiv.org/abs/1706.03762)
    + [Scaling Laws for Neural Language Models](https://arxiv.org/abs/2001.08361)
    + [BERT](https://arxiv.org/abs/1810.04805)
    + [Language Models are Unsupervised Multi-Task Learners](https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf)
    + [Training Language Models to Follow Instructions](https://arxiv.org/abs/2203.02155)
    + [Language Models are Few-Shot Learners](https://arxiv.org/abs/2005.14165)

    ## Training Your Own
    + [Why host your own LLM?](http://marble.onl/posts/why_host_your_own_llm.html)
    + [How to train your own LLMs](https://blog.replit.com/llm-training)
    + Training [Compute-Optimal Large Language Models](https://arxiv.org/abs/2203.15556)

    ## Algos
    + [What are embeddings](https://vickiboykis.com/what_are_embeddings/)

    + [The case for GZIP Classifiers](https://nlpnewsletter.substack.com/p/flashier-attention-gzip-classifiers) and [more on nearest neighbors algos](https://magazine.sebastianraschka.com/p/large-language-models-and-nearest)
    + [Meta Recsys Using and extending Word2Vec](https://engineering.fb.com/2023/08/09/ml-applications/scaling-instagram-explore-recommendations-system)
    + [The State of GPT (YouTube)](https://www.youtube.com/watch?v=bZQun8Y4L2A)
  18. @veekaybee revised this gist Aug 17, 2023. 1 changed file with 2 additions and 0 deletions.
    2 changes: 2 additions & 0 deletions normcore-llm.md
    @@ -1,5 +1,7 @@
    # Anti-hype LLM reading list

    Goals: Add links that are reasonable and good explanations of how stuff works. No hype and no vendor content if possible.

    ## Background

    + [Survey of LLMS](https://arxiv.org/abs/2303.18223)
  19. @veekaybee revised this gist Aug 17, 2023. 1 changed file with 1 addition and 0 deletions.
    1 change: 1 addition & 0 deletions normcore-llm.md
    @@ -13,6 +13,7 @@
    + [The case for GZIP Classifiers](https://nlpnewsletter.substack.com/p/flashier-attention-gzip-classifiers) and [more on nearest neighbors algos](https://magazine.sebastianraschka.com/p/large-language-models-and-nearest)
    + [Meta Recsys Using and extending Word2Vec](https://engineering.fb.com/2023/08/09/ml-applications/scaling-instagram-explore-recommendations-system)
    + [The State of GPT (YouTube)](https://www.youtube.com/watch?v=bZQun8Y4L2A)
    + [What is ChatGPT doing and why does it work](https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/)

    ## Deployment

  20. @veekaybee revised this gist Aug 17, 2023. 1 changed file with 2 additions and 2 deletions.
    4 changes: 2 additions & 2 deletions normcore-llm.md
    @@ -12,13 +12,13 @@
    + [What are embeddings](https://vickiboykis.com/what_are_embeddings/)
    + [The case for GZIP Classifiers](https://nlpnewsletter.substack.com/p/flashier-attention-gzip-classifiers) and [more on nearest neighbors algos](https://magazine.sebastianraschka.com/p/large-language-models-and-nearest)
    + [Meta Recsys Using and extending Word2Vec](https://engineering.fb.com/2023/08/09/ml-applications/scaling-instagram-explore-recommendations-system)
    + [The State of GPT](https://www.youtube.com/watch?v=bZQun8Y4L2A)
    + [The State of GPT (YouTube)](https://www.youtube.com/watch?v=bZQun8Y4L2A)

    ## Deployment

    + [Building LLM Applications for Production](https://huyenchip.com/2023/04/11/llm-engineering.html)
    + [Challenges and Applications of Large Language Models](https://arxiv.org/abs/2307.10169)
    + All the Hard Stuff Nobody talks about when [building products with LLMs (YouTube)](https://www.honeycomb.io/blog/hard-stuff-nobody-talks-about-llm)
    + [All the Hard Stuff Nobody talks about when building products with LLMs ](https://www.honeycomb.io/blog/hard-stuff-nobody-talks-about-llm)

    ## Evaluation

  21. @veekaybee revised this gist Aug 17, 2023. 1 changed file with 7 additions and 0 deletions.
    7 changes: 7 additions & 0 deletions normcore-llm.md
    @@ -9,13 +9,20 @@
    + [How to train your own LLMs](https://blog.replit.com/llm-training)

    ## Algos
    + [What are embeddings](https://vickiboykis.com/what_are_embeddings/)
    + [The case for GZIP Classifiers](https://nlpnewsletter.substack.com/p/flashier-attention-gzip-classifiers) and [more on nearest neighbors algos](https://magazine.sebastianraschka.com/p/large-language-models-and-nearest)
    + [Meta Recsys Using and extending Word2Vec](https://engineering.fb.com/2023/08/09/ml-applications/scaling-instagram-explore-recommendations-system)
    + [The State of GPT](https://www.youtube.com/watch?v=bZQun8Y4L2A)

    ## Deployment

    + [Building LLM Applications for Production](https://huyenchip.com/2023/04/11/llm-engineering.html)
    + [Challenges and Applications of Large Language Models](https://arxiv.org/abs/2307.10169)
    + All the Hard Stuff Nobody talks about when [building products with LLMs (YouTube)](https://www.honeycomb.io/blog/hard-stuff-nobody-talks-about-llm)

    ## Evaluation



    ## UX

  22. @veekaybee revised this gist Aug 15, 2023. 1 changed file with 4 additions and 0 deletions.
    4 changes: 4 additions & 0 deletions normcore-llm.md
    @@ -1,5 +1,9 @@
    # Anti-hype LLM reading list

    ## Background

    + [Survey of LLMS](https://arxiv.org/abs/2303.18223)

    ## Training Your Own
    + [Why host your own LLM?](http://marble.onl/posts/why_host_your_own_llm.html)
    + [How to train your own LLMs](https://blog.replit.com/llm-training)
  23. @veekaybee created this gist Aug 15, 2023.
    18 changes: 18 additions & 0 deletions normcore-llm.md
    @@ -0,0 +1,18 @@
    # Anti-hype LLM reading list

    ## Training Your Own
    + [Why host your own LLM?](http://marble.onl/posts/why_host_your_own_llm.html)
    + [How to train your own LLMs](https://blog.replit.com/llm-training)

    ## Algos
    + [The case for GZIP Classifiers](https://nlpnewsletter.substack.com/p/flashier-attention-gzip-classifiers) and [more on nearest neighbors algos](https://magazine.sebastianraschka.com/p/large-language-models-and-nearest)
    + [Meta Recsys Using and extending Word2Vec](https://engineering.fb.com/2023/08/09/ml-applications/scaling-instagram-explore-recommendations-system)

    ## Deployment

    + [Building LLM Applications for Production](https://huyenchip.com/2023/04/11/llm-engineering.html)
    + [Challenges and Applications of Large Language Models](https://arxiv.org/abs/2307.10169)

    ## UX

    + [Generative Interfaces Beyond Chat (YouTube)](https://www.youtube.com/watch?v=rd-J3hmycQs)