# grabit – Repo Index & Q/A Tool (2025 Plan)

## Overview

`grabit` is a self-contained Go binary that:

- Indexes a local folder or shallow-cloned GitHub repo
- Chunks source files into manageable blocks
- Embeds those blocks using OpenAI embeddings
- Stores an on-disk index (`.grabit/`)
- Lets you run **semantic search** or **ask natural language questions** against the repo
- Answers are powered by OpenAI’s **Responses API** with Retrieval-Augmented Generation (RAG)

This gives you a local “chat with your codebase” workflow that’s portable, fast, and easy to run.

---

## Requirements

- Go 1.22+
- `git` (only if using `--repo` mode for shallow clone)
- Environment variable:
  - `OPENAI_API_KEY` (required)

Optional:

- `GITHUB_TOKEN` (higher API rate limits if you extend for GitHub API usage)

---

## Default Models (2025)

- **Responses / Q&A**:
  - `gpt-4o` (balanced), or
  - `o3-mini` (stronger reasoning, higher cost)
- **Embeddings**:
  - `text-embedding-3-large` (best recall, 3072-dim)
  - `text-embedding-3-small` (cheaper, 1536-dim, optional)

---

## CLI Commands

```bash
grabit index --path .                  # Index current repo
grabit index --repo https://github.com/org/repo
grabit search "rate limiter"           # Semantic search
grabit ask "How does auth work?"       # Ask with RAG
grabit map                             # Show file types & index stats
```

---

## Pipeline

### 1. File Discovery

- Walk repo
- Include extensions: `.go`, `.rb`, `.py`, `.ts`, `.js`, `.java`, `.c`, `.cpp`, `.rs`, `.sql`, `.yaml`, `.json`, etc.
- Exclude dirs: `.git`, `node_modules`, `dist`, `build`, `venv`, etc.
- Skip >5MB files

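A minimal sketch of this walk in Go, assuming `filepath.WalkDir` with hard-coded include/exclude sets; `includeExt`, `excludeDir`, and `maxFileSize` are illustrative names, not a final API:

```go
// Sketch: walk the repo, keep files by extension, skip excluded
// directories, and drop anything over 5 MB.
package main

import (
	"fmt"
	"io/fs"
	"path/filepath"
)

var includeExt = map[string]bool{
	".go": true, ".rb": true, ".py": true, ".ts": true, ".js": true,
	".java": true, ".c": true, ".cpp": true, ".rs": true,
	".sql": true, ".yaml": true, ".json": true,
}

var excludeDir = map[string]bool{
	".git": true, "node_modules": true, "dist": true, "build": true, "venv": true,
}

const maxFileSize = 5 << 20 // 5 MB

func discover(root string) ([]string, error) {
	var files []string
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() {
			if excludeDir[d.Name()] {
				return filepath.SkipDir
			}
			return nil
		}
		if !includeExt[filepath.Ext(path)] {
			return nil
		}
		info, err := d.Info()
		if err != nil || info.Size() > maxFileSize {
			return nil // skip unreadable or oversized files
		}
		files = append(files, path)
		return nil
	})
	return files, err
}

func main() {
	files, err := discover(".")
	if err != nil {
		panic(err)
	}
	fmt.Println(len(files), "files discovered")
}
```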
### 2. Chunking

- ~700 bytes per chunk
- ~120 byte overlap
- Tracks file + line ranges for citations

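A rough sketch of the byte-window chunker, assuming a simple `Chunk` struct that carries the file path and line range; the struct and constants are placeholders, not the final layout:

```go
// Sketch: slide a ~700-byte window with ~120 bytes of overlap and
// record the line span each chunk covers for later citations.
package main

import (
	"fmt"
	"os"
	"strings"
)

type Chunk struct {
	File      string `json:"file"`
	StartLine int    `json:"start_line"`
	EndLine   int    `json:"end_line"`
	Text      string `json:"text"`
}

const (
	chunkSize = 700 // ~700 bytes per chunk
	overlap   = 120 // ~120 bytes shared with the previous chunk
)

func chunkFile(path string) ([]Chunk, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, err
	}
	text := string(data)
	var chunks []Chunk
	for start := 0; start < len(text); start += chunkSize - overlap {
		end := start + chunkSize
		if end > len(text) {
			end = len(text)
		}
		chunks = append(chunks, Chunk{
			File:      path,
			StartLine: 1 + strings.Count(text[:start], "\n"),
			EndLine:   1 + strings.Count(text[:end], "\n"),
			Text:      text[start:end],
		})
		if end == len(text) {
			break
		}
	}
	return chunks, nil
}

func main() {
	chunks, err := chunkFile("main.go")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d chunks, first spans lines %d-%d\n",
		len(chunks), chunks[0].StartLine, chunks[0].EndLine)
}
```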
### 3. Embedding

- Use `text-embedding-3-large`
- Batch API calls (default 64 per batch)
- Save vectors alongside text chunks

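A sketch of the batched calls against OpenAI’s `/v1/embeddings` endpoint; only the response fields used here are modelled, and retries/rate-limit handling are left out:

```go
// Sketch: embed chunk texts in batches of 64 via /v1/embeddings.
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"os"
)

type embeddingRequest struct {
	Model string   `json:"model"`
	Input []string `json:"input"`
}

type embeddingResponse struct {
	Data []struct {
		Embedding []float32 `json:"embedding"`
	} `json:"data"`
}

func embedBatch(texts []string) ([][]float32, error) {
	body, _ := json.Marshal(embeddingRequest{Model: "text-embedding-3-large", Input: texts})
	req, err := http.NewRequest("POST", "https://api.openai.com/v1/embeddings", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+os.Getenv("OPENAI_API_KEY"))
	req.Header.Set("Content-Type", "application/json")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	var out embeddingResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return nil, err
	}
	vectors := make([][]float32, len(out.Data))
	for i, d := range out.Data {
		vectors[i] = d.Embedding
	}
	return vectors, nil
}

// embedAll walks the chunk texts in batches of 64, as described above.
func embedAll(texts []string) ([][]float32, error) {
	const batchSize = 64
	var all [][]float32
	for i := 0; i < len(texts); i += batchSize {
		end := i + batchSize
		if end > len(texts) {
			end = len(texts)
		}
		vecs, err := embedBatch(texts[i:end])
		if err != nil {
			return nil, err
		}
		all = append(all, vecs...)
	}
	return all, nil
}

func main() {
	vecs, err := embedAll([]string{"package main", "func main() {}"})
	if err != nil {
		panic(err)
	}
	fmt.Println("embedded", len(vecs), "chunks, dim", len(vecs[0]))
}
```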
### 4. Storage

- `.grabit/index.jsonl` → newline JSON (chunks + vectors)
- `.grabit/meta.json` → model + repo metadata

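A sketch of the on-disk layout, assuming one JSON object per line in `index.jsonl` plus a small `meta.json`; the `record` and `meta` fields are illustrative:

```go
// Sketch: write the chunk records as JSONL and the index metadata as
// a separate meta.json inside .grabit/.
package main

import (
	"encoding/json"
	"os"
	"path/filepath"
)

type record struct {
	File      string    `json:"file"`
	StartLine int       `json:"start_line"`
	EndLine   int       `json:"end_line"`
	Text      string    `json:"text"`
	Vector    []float32 `json:"vector"`
}

type meta struct {
	EmbedModel string `json:"embed_model"`
	Repo       string `json:"repo"`
	Chunks     int    `json:"chunks"`
}

func writeIndex(dir string, records []record, m meta) error {
	if err := os.MkdirAll(dir, 0o755); err != nil {
		return err
	}
	f, err := os.Create(filepath.Join(dir, "index.jsonl"))
	if err != nil {
		return err
	}
	defer f.Close()
	enc := json.NewEncoder(f) // Encode appends a newline: one record per line
	for _, r := range records {
		if err := enc.Encode(r); err != nil {
			return err
		}
	}
	metaBytes, _ := json.MarshalIndent(m, "", "  ")
	return os.WriteFile(filepath.Join(dir, "meta.json"), metaBytes, 0o644)
}

func main() {
	recs := []record{{File: "main.go", StartLine: 1, EndLine: 20, Text: "...", Vector: []float32{0.1, 0.2}}}
	if err := writeIndex(".grabit", recs, meta{EmbedModel: "text-embedding-3-large", Chunks: len(recs)}); err != nil {
		panic(err)
	}
}
```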
### 5. Retrieval

- Query → embed → cosine similarity over chunks
- Top-K (default 12) returned

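A sketch of the ranking step: cosine similarity over the stored vectors, then a Top-K cut. The query is assumed to have been embedded with the same model as the index:

```go
// Sketch: score every stored vector against the query vector and
// return the K best matches.
package main

import (
	"fmt"
	"math"
	"sort"
)

type scored struct {
	Index int
	Score float64
}

func cosine(a, b []float32) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += float64(a[i]) * float64(b[i])
		na += float64(a[i]) * float64(a[i])
		nb += float64(b[i]) * float64(b[i])
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

func topK(query []float32, vectors [][]float32, k int) []scored {
	results := make([]scored, len(vectors))
	for i, v := range vectors {
		results[i] = scored{Index: i, Score: cosine(query, v)}
	}
	sort.Slice(results, func(i, j int) bool { return results[i].Score > results[j].Score })
	if k > len(results) {
		k = len(results)
	}
	return results[:k]
}

func main() {
	vectors := [][]float32{{1, 0}, {0.9, 0.1}, {0, 1}}
	for _, r := range topK([]float32{1, 0}, vectors, 2) {
		fmt.Printf("chunk %d score %.3f\n", r.Index, r.Score)
	}
}
```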
### 6. Prompt Assembly

- Include selected chunks in context:
  - `FILE: path LINES: start-end`
  - Snippet text
- Guard against max token budget
- Instruction: “Answer strictly from context; if not present, say so”

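A sketch of the assembly step; the character budget below is a crude stand-in for real token counting, and the exact header and instruction wording are illustrative:

```go
// Sketch: prepend the grounding instruction, add FILE/LINES-headed
// snippets until the budget is hit, then append the question.
package main

import (
	"fmt"
	"strings"
)

type Chunk struct {
	File      string
	StartLine int
	EndLine   int
	Text      string
}

const maxContextChars = 24000 // rough budget; a real build would count tokens

func buildPrompt(question string, chunks []Chunk) string {
	var b strings.Builder
	b.WriteString("Answer strictly from the context below; if the answer is not present, say so.\n\n")
	for _, c := range chunks {
		block := fmt.Sprintf("FILE: %s LINES: %d-%d\n%s\n\n", c.File, c.StartLine, c.EndLine, c.Text)
		if b.Len()+len(block) > maxContextChars {
			break // guard against blowing the token budget
		}
		b.WriteString(block)
	}
	b.WriteString("QUESTION: " + question + "\n")
	return b.String()
}

func main() {
	chunks := []Chunk{{File: "auth/middleware.go", StartLine: 10, EndLine: 32, Text: "// ...snippet..."}}
	fmt.Println(buildPrompt("How does auth work?", chunks))
}
```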
### 7. Answer

- Responses API (`/v1/responses`)
- Model = `gpt-4o` (default, overridable via `--model`)

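A sketch of the call to `/v1/responses`; the request/response structs model only what is needed to pull out the `output_text` pieces, and streaming is omitted:

```go
// Sketch: send the assembled prompt as `input` and concatenate the
// output_text content from the response.
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"os"
)

type responsesRequest struct {
	Model string `json:"model"`
	Input string `json:"input"`
}

type responsesResponse struct {
	Output []struct {
		Content []struct {
			Type string `json:"type"`
			Text string `json:"text"`
		} `json:"content"`
	} `json:"output"`
}

func answer(model, prompt string) (string, error) {
	body, _ := json.Marshal(responsesRequest{Model: model, Input: prompt})
	req, err := http.NewRequest("POST", "https://api.openai.com/v1/responses", bytes.NewReader(body))
	if err != nil {
		return "", err
	}
	req.Header.Set("Authorization", "Bearer "+os.Getenv("OPENAI_API_KEY"))
	req.Header.Set("Content-Type", "application/json")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	var out responsesResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return "", err
	}
	var text string
	for _, item := range out.Output {
		for _, c := range item.Content {
			if c.Type == "output_text" {
				text += c.Text
			}
		}
	}
	return text, nil
}

func main() {
	reply, err := answer("gpt-4o", "Answer strictly from the context below...\n\nQUESTION: How does auth work?")
	if err != nil {
		panic(err)
	}
	fmt.Println(reply)
}
```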
---

## Roadmap / Enhancements

- Parallel embedding worker pool
- Smarter chunking (AST-based for Go/Ruby)
- Hybrid retrieval (keyword + embedding)
- SQLite w/ `sqlite-vec` for faster search
- Citations in answers (file + line refs)
- Streaming responses for better UX
- RAG eval harness (measure groundedness)

---

## Usage Pattern

1. User installs binary
2. Exports API key:

```bash
export OPENAI_API_KEY=sk-...
```

3. Runs `grabit index --path .`
4. Runs `grabit ask "Where are rate limits enforced?"`
5. Gets answer + context citations

## Notes

- Designed to be portable: single binary, no extra DB
- OpenAI key is the only dependency
- Extensible: can later plug into FAISS, pgvector, or SQLite
- Works offline after indexing except for OpenAI calls