Goals: add links that are reasonable and give good explanations of how stuff works. No hype, and no vendor content if possible. Practical first-hand accounts of models in prod are eagerly sought.
```ts
/**
 * My Portable RAG
 * $ pnpm add sqlite-vec @ai-sdk/google ai
 * SQLite Vector Search + Google AI Embeddings
 *
 * Required environment variables:
 *   GOOGLE_GENERATIVE_AI_API_KEY=your-api-key
 *
 * Usage:
 *   # Index text content
 */
```
```js
export const requireComment = {
  meta: {
    type: "suggestion",
    docs: {
      description: "useEffect must be explained with a comment.",
    },
    schema: [],
    messages: {
      requireCommentOnUseEffect: "useEffect must be explained with a comment.",
    },
  },
  // ...
};
```
| """ | |
| A minimal, fast example generating text with Llama 3.1 in MLX. | |
| To run, install the requirements: | |
| pip install -U mlx transformers fire | |
| Then generate text with: | |
| python l3min.py "How tall is K2?" |
| """QA Chatbot streaming using FastAPI, LangChain Expression Language , OpenAI, and Chroma. | |
| Features | |
| -------- | |
| - Persistent Chat Memory: | |
| Stores chat history in a local file. | |
| - Persistent Vector Store: | |
| Stores document embeddings in a local vector store. | |
| - Standalone Question Generation: | |
| Rephrases follow-up questions to standalone questions in their original language. |
```bash
#!/bin/bash
SCRIPTNAME=$(basename "$0")
# Portable realpath replacement: print the absolute path of a file or directory.
function realpath () {
    f=$@;
    if [ -d "$f" ]; then
        base="";
        dir="$f";
    else
        base="/$(basename "$f")";
        dir=$(dirname "$f");
    fi
    dir=$(cd "$dir" && pwd);
    echo "$dir$base";
}
```
This worked on 14/May/23. The instructions will probably require updating in the future.
LLaMA is a text-prediction model similar to GPT-2, or to the version of GPT-3 that has not been fine-tuned yet. It should also be possible to run fine-tuned versions (like Alpaca or Vicuna) with this, I think; those versions are more focused on answering questions.
Note: I have been told that this does not support multiple GPUs; it can only use a single GPU.
It is now possible to run LLaMA 13B with a 6 GB graphics card (e.g. an RTX 2060), thanks to the amazing work on llama.cpp. The latest change adds CUDA/cuBLAS support, which lets you pick an arbitrary number of transformer layers to run on the GPU (the `--n-gpu-layers`/`-ngl` option). This is perfect for low VRAM; a rough sketch follows below.
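A minimal sketch of what that looks like, assuming a quantized 13B GGML model file; the model path, the layer count, and the prompt are placeholders, and the flags are the ones llama.cpp used around May 2023:

```bash
# Build llama.cpp with cuBLAS enabled.
make clean && make LLAMA_CUBLAS=1

# Offload some transformer layers to the GPU with --n-gpu-layers (-ngl);
# lower the number if you run out of VRAM. The model path is a placeholder.
./main -m ./models/13B/ggml-model-q4_0.bin --n-gpu-layers 24 -p "How tall is K2?"
```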
Follow the `ruby.wasm` tutorial:

```bash
curl -LO https://github.com/ruby/ruby.wasm/releases/latest/download/ruby-head-wasm32-unknown-wasi-full.tar.gz
```
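The tutorial's next steps go roughly like this; the extracted directory name and the wasmtime invocation below are assumptions, so check the ruby.wasm README for the exact paths:

```bash
# Assumed follow-up steps; the directory name inside the tarball may differ.
tar xfz ruby-head-wasm32-unknown-wasi-full.tar.gz
mv head-wasm32-unknown-wasi-full/usr/local/bin/ruby ruby.wasm

# Run it with a WASI runtime such as wasmtime.
wasmtime ruby.wasm -- -e 'puts RUBY_VERSION'
```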
```ts
import { Client } from '../../../src/middleware/client'
import type { AppType } from './server'

// Type-safe client for the API described by AppType.
const client = new Client<AppType>('http://127.0.0.1:8787/api')

const res = await client.json('/posts', {
  id: 123,
  title: 'hello',
})
```
```js
const isUseEffect = (node) => node.callee.name === 'useEffect';
const argumentIsArrowFunction = (node) => node.arguments[0].type === 'ArrowFunctionExpression';
const effectBodyIsSingleFunction = (node) => {
  const { body } = node.arguments[0];
  // It's a single unwrapped function call:
  // `useEffect(() => theNameOfAFunction(), []);`
  if (body.type === 'CallExpression') {
    return true;
  }
  return false;
};
```