Last active: October 18, 2025 21:31
Revisions
steipete revised this gist
Oct 7, 2025. 1 changed file with 1 addition and 1 deletion.
Diff `@@ -1,5 +1,5 @@`: the live development key is swapped for a placeholder, so the prompt now reads:

> build swagent swift cli and test it yourself until it works. use this development key for testing with openai API: `<enter key>`
steipete revised this gist
Oct 7, 2025. 1 changed file with 2 additions and 1 deletion.
Diff `@@ -1,4 +1,5 @@`: the development key is added to the prompt, which now reads:

> build swagent swift cli and test it yourself until it works. use this development key for testing with openai API: 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
steipete revised this gist
Sep 30, 2025. No changes.
steipete revised this gist
Sep 30, 2025. 1 changed file with 267 additions and 261 deletions.
Below is a clean, workshop‑ready guide for **swagent**, split into three parts as requested.

---

## 1) Docs — the exact contract (Responses API, tools, streaming, chaining)

**Endpoints**

* Create/continue a response: `POST https://api.openai.com/v1/responses`
  Headers: `Authorization: Bearer $OPENAI_API_KEY`, `Content-Type: application/json`. ([OpenAI Platform][1])

**Core request fields**

* `model`: `"gpt-5-codex"`.
* `instructions`: your system rules (string). Re‑send them on **every** turn.
* `input`: string **or** an array of **items** (e.g., user message, function call outputs).
* `store: true` if you’ll chain turns later with `previous_response_id`. ([OpenAI Platform][1])

**Tools (function calling)**

* Send tools as **top‑level** objects in `tools` with this shape:

```json
{
  "type": "function",
  "name": "run_bash",
  "description": "Run a bash command and return stdout, stderr, exitCode.",
  "parameters": {
    "type": "object",
    "properties": {
      "command": { "type": "string" },
      "cwd": { "type": "string" }
    },
    "required": ["command"]
  }
}
```

* You can let the model choose with `"tool_choice": "auto"`. ([OpenAI Platform][2])

**Function‑call loop (no `tool_outputs` param)**

1. First call: the model may return **items** of `type: "function_call"` in `output` with `call_id`, `name`, and `arguments` (a JSON string).
2. Run the tool locally.
3. Continue the run by calling `POST /v1/responses` **again** with:

   * `previous_response_id`: the prior response `id`
   * `instructions`: the same system rules
   * `input`: an **array of items**, each

```json
{
  "type": "function_call_output",
  "call_id": "<same id>",
  "output": "<stringified JSON like { stdout, stderr, exitCode }>"
}
```

This is how you return tool results. Don’t send a top‑level `tool_outputs` field. ([OpenAI Platform][3])

**Streaming (SSE)**

* Set `"stream": true` to get **Server‑Sent Events** while the model is thinking. You’ll receive events such as:

  * `response.created` (start)
  * `response.output_text.delta` (text chunks)
  * `response.function_call.delta` (incremental function args)
  * `response.completed` (final object, includes `usage`)

  Handle errors via `response.error`. ([OpenAI Platform][4])

**Usage & token counters**

* Use `usage.input_tokens`, `usage.output_tokens`, `usage.total_tokens` (snake_case) to print per‑turn stats. These arrive on the final response (or `response.completed` in streaming). ([OpenAI Platform][5])

**Conversation state**

* To continue a chat **without** resending past text, set `previous_response_id` and re‑send your `instructions`. You may also pass prior output items explicitly if you need to. ([OpenAI Platform][6])

**Progress signal taxonomy (what to show in the CLI)**

* Before the first output token: **“🧠 thinking…”** (spinner) once you receive `response.created`.
* While streaming text: live‑print each `response.output_text.delta`.
* When the model starts a tool: **“🔧 run_bash …”** as soon as you see `response.function_call.delta` / the final `function_call` item.
* While executing the tool: **“⏳ running command…”** until you post the `function_call_output` and the model resumes.
* On finalization: **“✅ done”** once `response.completed` arrives, then print the footer with `usage`. ([OpenAI Platform][4])
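The taxonomy above maps directly onto a small line‑based parser. A minimal sketch, assuming `event:`/`data:` line pairs and a `{"delta": "..."}` payload for text deltas (verify both against the streaming reference; the type names are illustrative):

```swift
import Foundation

/// Events the CLI cares about, mirroring the taxonomy above.
enum StreamEvent {
    case created                      // show the "🧠 thinking…" spinner
    case textDelta(String)            // live-print the chunk
    case functionCallDelta(String)    // show "🔧 run_bash …"
    case completed(Data)              // final JSON, including `usage`, for the footer
}

struct SSEParser {
    private var pendingEvent: String?

    /// Feed one line of the SSE body; emits an event once an `event:` line
    /// has been paired with its `data:` line.
    mutating func consume(_ line: String) -> StreamEvent? {
        if line.hasPrefix("event: ") {
            pendingEvent = String(line.dropFirst("event: ".count))
            return nil
        }
        guard line.hasPrefix("data: "), let event = pendingEvent else { return nil }
        pendingEvent = nil
        let payload = Data(line.dropFirst("data: ".count).utf8)
        switch event {
        case "response.created":
            return .created
        case "response.output_text.delta":
            // Assumed payload shape: { "delta": "..." }.
            let object = try? JSONSerialization.jsonObject(with: payload) as? [String: Any]
            return (object?["delta"] as? String).map(StreamEvent.textDelta)
        case "response.function_call.delta":
            return .functionCallDelta(String(decoding: payload, as: UTF8.self))
        case "response.completed":
            return .completed(payload)
        default:
            return nil
        }
    }
}
```

The UI state machine then flips from the spinner to live text on the first `.textDelta`, to “⏳ running…” on a function‑call event, and prints the footer from the `.completed` payload.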
---

## 2) Full instructions — build the whole CLI in one pass

**Project:** `swagent`
**Language/Tooling:** Swift 6.2 with `SwiftSetting.defaultIsolation(MainActor.self)` enabled via SPM; dependencies: `swift-argument-parser`, `apple/swift-configuration`; built‑in **Swift Testing** (Xcode 26); add **swift-format** and **SwiftLint**.
**Model/API:** OpenAI **Responses API**, `model: gpt-5-codex`, streaming **on**. ([OpenAI Platform][1])

**Startup UX**

* Print **2–3 cheeky lines** (random) and the **masked API key** (first 3 + last 4).
* Examples:

  * “🎩 I code therefore I am.”
  * “⚡ One prompt. One shot. Make it count.”
  * “🔧 Small diffs, big wins.”
  * “🧪 If it compiles, we ship. Mostly.”
  * “🐚 Bashful? I’m not.”

**Flags**

* `-v, --verbose` — extra logs (HTTP status, timings).
* `--version` — print version.
* `-p <prompt>` — one‑shot user interaction; **internally** the agent may loop via tools until `finish` or it needs info.
* `--yolo` — auto‑approve all shell commands (no interactive Y/n).
* `--session <uuid>` — load a persisted session.

**Commands**

* `/new` or `/clear` — reset conversation state.
* `/status` — show masked key, token totals this session, **estimated** remaining context.
* `/exit` — quit; print: *“To resume this session, call `swagent --session <uuid>`.”*

**System prompt (embed verbatim in `instructions` every turn)**

> **You are swagent**, a coding agent for terminal workflows.
> **Runtime:** **macOS 26 or later**.
> **Mission:** Build, run, and refine code + shell workflows; verify your work.
> **Behavior:**
>
> * Think step‑by‑step; prefer small diffs and working patches.
> * When you propose commands, **call `run_bash`** to execute them; **never** ask the user to confirm (the CLI handles approvals).
> * If the runtime says **yolo=true**, treat commands as pre‑approved and run immediately.
> * If **yolo=false** and a command is destructive/ambiguous, call `request_more_info(question)` once; otherwise, just `run_bash`.
> * When done, call `finish(summary)` with a concise summary + next steps.
> * Keep output terminal‑friendly and concise; never print secrets.
>
> **Tools:**
>
> 1. `run_bash(command: string, cwd?: string)` → returns `{stdout, stderr, exitCode}`.
> 2. `request_more_info(question: string)`
> 3. `finish(summary: string)`
>
> **Responses API rules:**
>
> * Use `model: gpt-5-codex`.
> * Re‑send these instructions every turn.
> * Chain with `previous_response_id`.
> * Tools are top‑level `{ type:'function', name, description, parameters }`.
> * Tool calls arrive as `output` items of `type:'function_call'` with a `call_id`. **Return results** by continuing with `previous_response_id` and sending `input: [{ "type":"function_call_output", "call_id":"<same>", "output":"<stringified JSON>" }]`.
> * Read `usage.input_tokens`, `usage.output_tokens`, `usage.total_tokens` for per‑turn stats.
>
> **[swagent runtime]**
> `yolo=true|false` • `verbose=true|false` • `session=<uuid>` • `cwd=<path>`

([OpenAI Platform][2])

**Runtime header**

Append the `[swagent runtime]` block above to `instructions` every turn (so the agent knows about `yolo`, etc.). ([OpenAI Platform][6])

**Tooling & policies**

* **Bash tool**: Implement `run_bash(command, cwd?)` via `bash -lc`. By default, prompt `Run? [Y/n]` (Enter = Yes). With `--yolo`, auto‑approve. Return `{stdout, stderr, exitCode}` (JSON), but **stringify** it before sending as `function_call_output` (see the sketch after this list).
* **Ask‑for‑info tool**: `request_more_info(question)` prints the question and waits for a one‑line user reply; forward that as the next turn’s user message (you can co‑send it alongside tool outputs in `input`).
* **Finish tool**: `finish(summary)` prints the summary and ends the current action (stay in the REPL unless in `-p` mode).
* **Self‑testing**: After code changes, the agent **must** call `run_bash` to run `swift build` (and `swift test` if tests exist), and also self‑invoke the CLI (`swift run swagent …`) to verify flags.
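The executor behind `run_bash` is small. A minimal sketch under the rules above (`bash -lc`, captured streams, `Int32` exit code); the Y/n gate and error reporting are left to the caller, and the type names are illustrative:

```swift
import Foundation

/// What the tool returns to the model, stringified into `function_call_output`.
struct BashResult: Codable {
    let stdout: String
    let stderr: String
    let exitCode: Int32
}

func runBash(_ command: String, cwd: String? = nil) throws -> BashResult {
    let process = Process()
    process.executableURL = URL(fileURLWithPath: "/bin/bash")
    process.arguments = ["-lc", command]
    if let cwd { process.currentDirectoryURL = URL(fileURLWithPath: cwd) }

    let out = Pipe()
    let err = Pipe()
    process.standardOutput = out
    process.standardError = err

    try process.run()
    process.waitUntilExit()   // blocking wait keeps the sketch simple

    return BashResult(
        stdout: String(decoding: out.fileHandleForReading.readDataToEndOfFile(), as: UTF8.self),
        stderr: String(decoding: err.fileHandleForReading.readDataToEndOfFile(), as: UTF8.self),
        exitCode: process.terminationStatus
    )
}

// Stringify before sending it back as the `output` field:
// let output = String(decoding: try JSONEncoder().encode(result), as: UTF8.self)
```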
**Streaming & progress**

* Always set `"stream": true` when calling `/v1/responses`. Show:

  * **Thinking spinner** after `response.created` until the first `response.output_text.delta`.
  * **Live text streaming** by writing each `delta` chunk immediately.
  * **Tool call progress** when you see a `function_call` (or its deltas): print the command preview; switch to **“⏳ running…”** while executing; resume streaming once you send `function_call_output`.
  * **Footer** on `response.completed` using `usage.*` and a monotonic timer.

  Event names and flow: see the Responses streaming & Realtime guides. ([OpenAI Platform][4])

**Sessions**

* Persist under `~/.swagent/<uuid>.json` via an `actor` (a sketch follows this list).
* Save: `previous_response_id`, the chain of response ids, per‑session token totals, timestamps.
* `--session <uuid>` loads and continues from file.
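A sketch of that session actor, assuming the fields listed above (the type and method names are illustrative):

```swift
import Foundation

struct SessionState: Codable {
    var id: UUID
    var previousResponseID: String?   // latest `previous_response_id` to chain from
    var responseIDs: [String]         // chain of response ids
    var inputTokens: Int              // per-session token totals
    var outputTokens: Int
    var updatedAt: Date
}

actor SessionStore {
    private let directory = FileManager.default.homeDirectoryForCurrentUser
        .appendingPathComponent(".swagent")

    private func fileURL(for id: UUID) -> URL {
        directory.appendingPathComponent("\(id.uuidString).json")
    }

    func save(_ state: SessionState) throws {
        try FileManager.default.createDirectory(at: directory, withIntermediateDirectories: true)
        try JSONEncoder().encode(state).write(to: fileURL(for: state.id), options: .atomic)
    }

    func load(id: UUID) throws -> SessionState {
        try JSONDecoder().decode(SessionState.self, from: Data(contentsOf: fileURL(for: id)))
    }
}
```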
**Config**

* Use `swift-configuration` to read `OPENAI_API_KEY` from the environment; mask it as `sk‑abc…wxyz` on startup. ([OpenAI Platform][2])

**Testing, format, lint**

* Use **Swift Testing** (built‑in with Xcode 26) for unit tests.
* Add `swift-format` + `SwiftLint` targets/scripts.

**Security**

* Never echo secrets.
* Treat dangerous commands conservatively when `yolo=false` (use `request_more_info`).

**Minimal JSON crib sheet (copy/paste)**

*Create (turn 1, with tools & streaming):*

```json
{
  "model": "gpt-5-codex",
  "instructions": "<SYSTEM PROMPT + [swagent runtime]>",
  "input": "Create a Swift package and build it.",
  "tools": [
    { "type":"function","name":"run_bash","description":"Run bash","parameters":{
        "type":"object","properties":{"command":{"type":"string"},"cwd":{"type":"string"}},
        "required":["command"] }},
    { "type":"function","name":"request_more_info","parameters":{
        "type":"object","properties":{"question":{"type":"string"}},"required":["question"] }},
    { "type":"function","name":"finish","parameters":{
        "type":"object","properties":{"summary":{"type":"string"}},"required":["summary"] }}
  ],
  "tool_choice": "auto",
  "store": true,
  "stream": true
}
```

*Continue (turn 2, return tool result):*

```json
{
  "model": "gpt-5-codex",
  "instructions": "<SYSTEM PROMPT + [swagent runtime]>",
  "previous_response_id": "resp_123",
  "input": [
    { "type": "function_call_output",
      "call_id": "call_abc",
      "output": "{\"stdout\":\"initialized…\",\"stderr\":\"\",\"exitCode\":0}" }
  ],
  "stream": true
}
```

Docs: Responses create, streaming events, migration guide (`function_call_output`), usage counters, conversation state. ([OpenAI Platform][1])

---

## 3) Step‑by‑step — 5 tiny stages (each 7–12 minutes), with streaming & checks

### Stage 1 — Minimal one‑shot + streaming

**Build**

* SPM executable target; enable `SwiftSetting.defaultIsolation(MainActor.self)` in `swiftSettings`.
* Deps: `swift-argument-parser`, `swift-configuration`.
* Implement a single **Responses** call with `"stream": true`; stream `response.output_text.delta` to stdout.
* Startup prints **2–3 cheeky lines** + the masked key.
* Flags: `--version`, `-v`.

**Checks**

* `swagent --version` → prints the version only.
* `swagent -v "Ping"` → shows cheeky lines, masked key, streams text live, then the footer `(in: X, out: Y, total: Z, 0m 00s)` from `usage`. Streaming/usage: see docs. ([OpenAI Platform][4])
* No key → clear single‑line error.

---

### Stage 2 — Sticky chat (REPL), `-p` one‑shot, runtime header

**Build**

* Interactive REPL; keep `-p` for one‑shot.
* Maintain state via `previous_response_id` + `store:true`.
* Always re‑send `instructions` and attach a `[swagent runtime]` header with `yolo`, `verbose`, `session`, `cwd`.

**Checks**

* The second user turn uses the first turn’s `previous_response_id` (verify in the logs with `-v`).
* `/new` clears state (the next call has no `previous_response_id`).
* Streaming remains active in both the REPL and `-p`.

Chaining: see the conversation‑state docs. ([OpenAI Platform][6])

---

### Stage 3 — Agent signals (finish / request_more_info), loop via `function_call_output`

**Build**

* Add two tools:

  * `finish(summary: string)`
  * `request_more_info(question: string)`
* Implement the function‑call loop:

  * Parse any `function_call` items.
  * For `request_more_info`, print the question and wait for input; continue by sending a user message item in `input` (you can send it alongside any `function_call_output` items).
  * For `finish`, print the summary and stop the action.

**Checks**

* Prompt: “Ask me one clarifying question, then summarize and finish.” → Model calls `request_more_info` → collects the answer → model calls `finish` → summary printed + footer.
* Confirm there’s **no** top‑level `tool_outputs`; only `input` items with `type:"function_call_output"` on continuations. ([OpenAI Platform][3])

---

### Stage 4 — Bash tool (guardrails), self‑testing, yolo awareness

**Build**

* Add `run_bash(command, cwd?)`:

  * Default approval: `Run? [Y/n]` (Enter = Yes).
  * `--yolo`: auto‑approve.
  * Execute via `bash -lc`; capture `{stdout, stderr, exitCode}`; **stringify** it as the `output` field in `function_call_output`.
* The **system prompt** and runtime header explicitly say: the agent **never** asks for permission; `yolo=true` means pre‑approved.
* After code changes, the agent **must** self‑test: `swift build`, optionally `swift test`, then `swift run swagent …`.

**Checks**

* `swagent --yolo -p "Echo hello"` → model calls `run_bash("echo hello")` immediately (no extra prompt), the CLI runs it, the continuation sends `function_call_output`, and the run finalizes with a reply + footer.
* `swagent -p "Echo hello"` (non‑yolo) → the agent still **does not** ask; the CLI prompts Y/n; the run completes.
* The tool loop uses `previous_response_id` + `input` items, streaming on. ([OpenAI Platform][4])

---

### Stage 5 — Sessions, `/status`, tests, format/lint

**Build**

* Persist sessions under `~/.swagent/<uuid>.json` using an `actor`.
* `/status` prints: masked key; per‑session token totals; **estimated** context left (model limit minus running total).
* On exit: *“To resume this session, call `swagent --session <uuid>`.”*
* Tests with **Swift Testing** for:

  * Arg parsing (`-v`, `--version`, `-p`, `--yolo`, `--session`).
  * Session store save/load roundtrip (concurrent writes protected by the actor).
  * Tool approval logic (Y/n default vs `--yolo`); see the sketch after these checks.
* Add `swift-format` and `SwiftLint` targets (`make fmt`, `make lint`, `make check`).

**Checks**

* Two turns, then `/status` shows totals; `/exit` persists a JSON containing the latest `previous_response_id`, cumulative `usage`, and timestamps.
* `--session <uuid>` resumes and continues chaining.
* `make check` runs format, lint, and tests cleanly.
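The approval logic is the easiest of those tests to pin down. A sketch with Swift Testing, where `approvesCommand` is a hypothetical stand‑in for the real gating function:

```swift
import Testing

// Hypothetical gate: Enter (empty reply) defaults to Yes; --yolo skips the prompt.
func approvesCommand(reply: String?, yolo: Bool) -> Bool {
    if yolo { return true }
    guard let reply, !reply.isEmpty else { return true }
    return reply.lowercased().hasPrefix("y")
}

@Test func yoloBypassesThePrompt() {
    #expect(approvesCommand(reply: nil, yolo: true))
}

@Test func enterDefaultsToYes() {
    #expect(approvesCommand(reply: "", yolo: false))
}

@Test func explicitNoIsRespected() {
    #expect(!approvesCommand(reply: "n", yolo: false))
}
```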
---

### Minimal streaming cURL (for the slides)

```bash
curl https://api.openai.com/v1/responses \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -N \
  -d '{
    "model": "gpt-5-codex",
    "instructions": "…system prompt…",
    "input": "Say hello, slowly.",
    "stream": true
  }'
# Expect SSE events like: response.created, response.output_text.delta, response.completed
```

SSE event names and flow: Responses streaming docs (plus the Realtime guide for the event taxonomy). ([OpenAI Platform][4])

---

**References**

* Responses API — create & tools. ([OpenAI Platform][1])
* Streaming — SSE events for Responses. ([OpenAI Platform][4])
* Conversation state — `previous_response_id`. ([OpenAI Platform][6])
* Migration guide — `function_call_output` items. ([OpenAI Platform][3])
* Usage counters (snake_case). ([OpenAI Platform][5])

Want a tiny Swift snippet that shows parsing SSE lines and switching the UI between “🧠 thinking…”, streaming text, and tool execution?

[1]: https://platform.openai.com/docs/api-reference/responses/create?utm_cta=website-homepage-industry-card-media&utm_source=chatgpt.com "API Reference"
[2]: https://platform.openai.com/docs/api-reference/responses "OpenAI Platform"
[3]: https://platform.openai.com/docs/guides/migrate-to-responses "OpenAI Platform"
[4]: https://platform.openai.com/docs/api-reference/responses-streaming "OpenAI Platform"
[5]: https://platform.openai.com/docs/api-reference/usage "OpenAI Platform"
[6]: https://platform.openai.com/docs/guides/conversation-state "OpenAI Platform"
steipete revised this gist
Sep 30, 2025. 1 changed file with 269 additions and 224 deletions.
Nice—this feedback is gold. I baked **every missing specificity** into your stage prompts and the **one‑go build** so the agent has zero room to guess. Below you’ll find:

* a **Responses API “contract”** (exact payloads, shapes, key casing)
* a tighter **system prompt** (macOS 26, self‑testing, behavior when unsure)
* **stage acceptance checks** that force the model to actually **call `run_bash`** and verify
* an updated **one‑go build brief** with the full system prompt + a copy‑paste **crib sheet**

All references point to the official docs so agents don’t revert to legacy Chat Completions. ([OpenAI Platform][1])

---

## 🔒 Hardened API contract (copy/paste into your spec)

**Endpoint**

```http
POST https://api.openai.com/v1/responses
Authorization: Bearer $OPENAI_API_KEY
Content-Type: application/json
```

**Required fields & casing**

* `model` (e.g., `"gpt-5-codex"`)
* `instructions` (system‑like rules, string)
* `input` (the user turn, string or array of items—string is fine here)
* `store: true` if you plan to chain with `previous_response_id`
* **Tools (function calling)** go in `tools` as **top‑level** objects:

```json
{
  "type": "function",
  "name": "run_bash",
  "description": "Run a bash command and return stdout, stderr, exitCode.",
  "parameters": {
    "type": "object",
    "properties": {
      "command": { "type": "string" },
      "cwd": { "type": "string" }
    },
    "required": ["command"]
  }
}
```

> **Do not** use the old nested `function: { name, ... }` shape from Assistants. Responses uses **top‑level** `name/description/parameters`. ([OpenAI Platform][2])

**Conversation state**

* To continue a conversation **without** resending the whole transcript, pass `previous_response_id` on the next call.
* **Important:** `instructions` are **not carried over** with `previous_response_id`; **re‑send** your system prompt each turn. ([OpenAI Platform][3])

**Usage metrics (footer numbers)**

* Current casing: `usage.input_tokens`, `usage.output_tokens`, `usage.total_tokens` (snake_case). Use these for the per‑turn stats footer. ([OpenAI Platform][4])

**Output structure (what you’ll parse)**

* `output` is an **array of items**. Expect **assistant messages** and possible **function calls**:

  * Assistant text lives in a message item’s `content` as `{ "type": "output_text", "text": "..." }`. ([OpenAI Platform][3])
  * Function calls appear as items of type **`function_call`** with:

```json
{
  "type": "function_call",
  "call_id": "call_…",
  "name": "run_bash",
  "arguments": "{\"command\":\"swift build\"}"
}
```

  `arguments` is a **JSON string** (a decoding sketch follows). ([OpenAI Platform][5])

* To **return tool results**, create a follow‑up `responses.create` with:

  * the same `model`
  * `previous_response_id: "<id from the prior response>"`
  * `tool_outputs: [ { "call_id": "<call id>", "output": "<stringified JSON like { stdout, stderr, exitCode }>" } ]`
  * This yields a new response; continue until the model produces a final message or asks for more info. ([OpenAI Platform][6])
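Because `arguments` arrives as a JSON *string*, decoding it is a two‑step affair. A sketch before the full example (the field names mirror the item shape above; the Swift type names are illustrative):

```swift
import Foundation

/// One `function_call` item from `output`.
struct FunctionCallItem: Codable {
    let type: String        // "function_call"
    let call_id: String
    let name: String
    let arguments: String   // JSON-encoded string, e.g. {"command":"swift build"}
}

struct RunBashArgs: Codable {
    let command: String
    let cwd: String?
}

/// Step 2: decode the embedded JSON string into typed arguments.
func decodeRunBashArgs(_ item: FunctionCallItem) throws -> RunBashArgs {
    try JSONDecoder().decode(RunBashArgs.self, from: Data(item.arguments.utf8))
}
```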
**Minimal end‑to‑end example**

*Request (turn 1):*

```json
{
  "model": "gpt-5-codex",
  "instructions": "…swagent system rules…",
  "input": "Init a Swift package and build it.",
  "tools": [
    { "type":"function", "name":"run_bash", "description":"Run bash",
      "parameters":{ "type":"object","properties":{
        "command":{"type":"string"},
        "cwd":{"type":"string"}
      },"required":["command"] }},
    { "type":"function","name":"request_more_info",
      "parameters":{"type":"object","properties":{"question":{"type":"string"}},"required":["question"]}},
    { "type":"function","name":"finish",
      "parameters":{"type":"object","properties":{"summary":{"type":"string"}},"required":["summary"]}}
  ],
  "tool_choice": "auto",
  "store": true
}
```

*Response (turn 1 → includes a function call):*

```json
{
  "id": "resp_123",
  "output": [
    { "type": "message", "role": "assistant",
      "content": [ { "type": "output_text", "text": "Creating the package, then building…" } ] },
    { "type": "function_call", "call_id": "call_abc",
      "name": "run_bash",
      "arguments": "{\"command\":\"swift package init --type executable\"}" }
  ],
  "usage": { "input_tokens": 395, "output_tokens": 57, "total_tokens": 452 }
}
```

*Request (turn 2 → return the tool output):*

```json
{
  "model": "gpt-5-codex",
  "instructions": "…swagent system rules…",
  "previous_response_id": "resp_123",
  "tool_outputs": [
    { "call_id": "call_abc",
      "output": "{\"stdout\":\"initialized…\",\"stderr\":\"\",\"exitCode\":0}" }
  ]
}
```

*Response (turn 2 → may include another function call or final text):*

```json
{
  "id": "resp_124",
  "output": [
    { "type":"function_call", "call_id":"call_def",
      "name":"run_bash", "arguments":"{\"command\":\"swift build\"}" }
  ],
  "usage": { "input_tokens": 117, "output_tokens": 12, "total_tokens": 129 }
}
```

…and so on until the model emits a message (with `output_text`) or calls `finish`. ([OpenAI Platform][1])

**Key deltas vs Chat Completions (so agents don’t regress)**

* `input` + `instructions` instead of `messages`
* tool calls are **items** (type `function_call`) with `call_id` + an `arguments` string
* tool **results** are sent via `tool_outputs` (not “tool messages”)
* conversation uses `previous_response_id` (not an array of past messages)

The docs walk through each difference. ([OpenAI Platform][5])

---

## 🧠 Behavior policy (no guesswork)

* **Default stance:** if confident, **answer & act**; if not, call `request_more_info(question)` with one precise question.
* **Self‑testing is mandatory:** after proposing any command, **call `run_bash`** (post‑approval / `--yolo`) and verify the results. If a step fails, iterate until fixed or ask for info.
* **Live calls please:** run the **real** API end‑to‑end (we’ll provide an `OPENAI_API_KEY`). No stubs during the workshop.

Function‑calling & tool‑loop details: see the Function Calling + Responses guides. ([OpenAI Platform][7])

---

## 🧾 System prompt (final, paste verbatim)

> **You are swagent**, a coding agent for terminal workflows.
> **Runtime:** **macOS 26 or later**.
> **Mission:** Build, run, and refine code + shell workflows; verify your work.
> **Behavior:**
>
> * Think step‑by‑step; prefer small diffs and working patches.
> * When you propose commands, you **must** call `run_bash` to execute them (after user approval) and confirm results.
> * If blocked, call `request_more_info(question)` with one precise, answerable question.
> * When done, call `finish(summary)` with a concise summary + next steps.
> * Don’t exfiltrate secrets; avoid destructive commands unless asked.
> * Output stays terminal‑friendly and concise.
>
> **Tools:**
>
> 1. `run_bash(command: string, cwd?: string)` → return `{stdout, stderr, exitCode}`.
> 2. `request_more_info(question: string)`
> 3. `finish(summary: string)`
>
> **API rules (Responses API):**
>
> * Use `model: gpt-5-codex`.
> * Re‑send these instructions every turn.
> * Chain turns with `previous_response_id`.
> * Tools are defined with top‑level `name/description/parameters` (JSON Schema).
> * Tool calls arrive as `function_call` items with a `call_id`; return results using `tool_outputs` with the **same** `call_id`.
> * Read `usage.input_tokens`, `usage.output_tokens`, `usage.total_tokens` for stats.

([OpenAI Platform][1])

---

## 🧪 Stage plan (only the deltas that changed, with **self‑test** baked in)

### Stage 1 — minimal one‑shot

* Startup: print **2–3 cheeky lines** + the masked key (`sk‑abc…wxyz`).
* One call to Responses; print the reply + footer `(in/out/total, time)`.
* **Checks (must pass)**

  * `swagent -v "ping"` → shows cheeky lines, masked key, HTTP 200 log, model text, usage footer.
  * Missing key → single‑line error.
  * **The usage footer uses snake_case fields** from `usage` (helper sketches follow the stage plan). ([OpenAI Platform][4])

### Stage 2 — sticky chat (`-p` still one‑shot)

* Re‑send the system prompt every turn; chain with `previous_response_id`.
* **Checks**

  * REPL: the second turn uses the first turn’s `previous_response_id`.
  * `/new` clears the chain; the next call has **no** `previous_response_id`. ([OpenAI Platform][8])

### Stage 3 — agent signals

* Add `finish(summary)` and `request_more_info(question)`; implement the tool loop.
* **Checks**

  * Prompt: “Ask me one clarifying question, then summarize and finish.” → Model calls `request_more_info` → you answer → model calls `finish` → summary printed + stats.
  * Verify: each function call got a **matching** `tool_outputs` entry with the same `call_id`. ([OpenAI Platform][5])

### Stage 4 — bash tool + guardrails

* Tool: `run_bash(command, cwd?)`; Y/n prompt (Enter = Yes); `--yolo` auto‑approves.
* Inherit the environment so the agent can run `swift build` and `swift run` with your real key available.
* **Checks (self‑test required)**

  * `swagent -p "Create hello.sh, chmod +x, run it"` → the agent actually **calls `run_bash`** and shows `stdout: hello`.
  * `swagent --yolo -p "swift --version"` → auto‑runs, returns the output, then `finish`.
  * Each tool call returned a **tool output** with the same `call_id`. ([OpenAI Platform][7])

### Stage 5 — sessions + `/status` + tests

* Persist `~/.swagent/<uuid>.json` (actor, async file I/O).
* `/status`: masked key; `usage` totals; **estimated** remaining context.
* `--session <uuid>` resumes; on exit: “To resume this session, call `swagent --session <uuid>`.”
* Tests with **Swift Testing** (assume it’s built in to your Xcode 26 setup).
* **Checks**

  * Two turns → `/status` shows snake_case usage fields; `/exit` writes JSON with `previous_response_id`.
  * `--session <uuid>` resumes and chains from the file’s `previous_response_id`. ([OpenAI Platform][3])
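The stages keep referring to the masked key and the usage footer; here is a sketch of both helpers (the names are illustrative, and the snake_case properties deliberately mirror the API’s `usage` fields):

```swift
import Foundation

/// "sk-proj-…" → "sk-…wxyz" (first 3 + last 4, per the startup spec).
func maskKey(_ key: String) -> String {
    guard key.count > 7 else { return "sk-…" }
    return "\(key.prefix(3))…\(key.suffix(4))"
}

/// Mirrors the Responses API `usage` object (snake_case on purpose).
struct Usage: Codable {
    let input_tokens: Int
    let output_tokens: Int
    let total_tokens: Int
}

func footer(_ usage: Usage, seconds: Int) -> String {
    "(in: \(usage.input_tokens), out: \(usage.output_tokens), " +
        "total: \(usage.total_tokens) tokens, \(seconds / 60)m \(String(format: "%02d", seconds % 60))s)"
}
```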
---

## 🚀 One‑go build brief (give this to the model)

> **Project:** `swagent`
> **Env:** Swift 6.2 (SPM `.defaultIsolation(MainActor.self)`), macOS host
> **Deps:** `swift-argument-parser`, `apple/swift-configuration`, built‑in **Swift Testing**, plus **swift-format** and **SwiftLint**
> **System prompt:** *(paste the “System prompt (final)” above verbatim)*
> **Startup UX:** print 2–3 cheeky lines (random) + the masked API key (first 3 + last 4).
> **Flags:** `-v/--verbose`, `--version`, `-p <prompt>` (one‑shot UI; internal tool loop), `--yolo`, `--session <uuid>`.
> **Chat & sessions:** interactive REPL with `/new` `/clear` `/status` `/exit`; chain with `previous_response_id` + `store:true`; **always include** the system prompt each turn; persist under `~/.swagent/<uuid>.json`.
> **Tools (function calling):**
>
> * `run_bash(command, cwd?)` → prompt Y/n unless `--yolo`; execute with `bash -lc`; return `{stdout, stderr, exitCode}` (stringified JSON) via `tool_outputs` bound to the **same `call_id`**.
> * `request_more_info(question)` → ask the user, then continue.
> * `finish(summary)` → end and print the summary.
>
> **Per‑turn stats:** print `(in: N, out: M, total: T tokens, 0m 00s)` from `usage`.
> **Self‑test (mandatory):**
>
> 1. After generating code, **call `run_bash`** to run `swift build`.
> 2. If tests exist, **call `run_bash`** to run `swift test`.
> 3. **Call `run_bash`** to run:
>
>    * `swift run swagent --version`
>    * `swift run swagent -p "Echo hello"`
>    * `swift run swagent --yolo -p "Create hello.sh and run it"`
> 4. On any failure, inspect `stderr`, fix, and retry.
>
> **Responses API (use exactly this shape):** tools are **top‑level** (`type/name/description/parameters`); tool calls arrive as `function_call` items with `call_id` + `arguments` (a JSON string); tool results go in `tool_outputs`; usage fields are `input_tokens`, `output_tokens`, `total_tokens`. Use `previous_response_id` for chaining and **re‑send** the system prompt each turn.
> **Docs:** API Reference (Responses), Function Calling, Conversation State / `previous_response_id`, streaming/`output_text`, migration notes. ([OpenAI Platform][1])

---

## 📎 Responses API crib sheet (drop right into your repo/readme)

**Request (with tools):**

```json
{
  "model": "gpt-5-codex",
  "instructions": "<SYSTEM PROMPT HERE>",
  "input": "<USER INPUT HERE>",
  "tools": [
    { "type":"function","name":"run_bash","description":"Run bash","parameters":{
        "type":"object","properties":{"command":{"type":"string"},"cwd":{"type":"string"}},
        "required":["command"] }},
    { "type":"function","name":"request_more_info","parameters":{
        "type":"object","properties":{"question":{"type":"string"}},"required":["question"] }},
    { "type":"function","name":"finish","parameters":{
        "type":"object","properties":{"summary":{"type":"string"}},"required":["summary"] }}
  ],
  "tool_choice": "auto",
  "store": true
}
```

**Response (snippet):**

```json
{
  "id": "resp_abc",
  "output": [
    { "type":"message", "role":"assistant",
      "content":[{ "type":"output_text", "text":"…" }] },
    { "type":"function_call", "call_id":"call_123",
      "name":"run_bash", "arguments":"{\"command\":\"swift build\"}" }
  ],
  "usage": { "input_tokens": 123, "output_tokens": 45, "total_tokens": 168 }
}
```

**Continue with tool output:**

```json
{
  "model": "gpt-5-codex",
  "instructions": "<SYSTEM PROMPT HERE>",
  "previous_response_id": "resp_abc",
  "tool_outputs": [
    { "call_id":"call_123", "output":"{\"stdout\":\"…\",\"stderr\":\"\",\"exitCode\":0}" }
  ]
}
```

Refs for each bit: Responses API reference, `previous_response_id`, the function‑calling loop, `output_text`, `usage` fields. ([OpenAI Platform][1])

---

## 💬 Startup cheeky lines (pool)

* “🎩 I code therefore I am.”
* “⚡ One prompt. One shot. Make it count.”
* “🧪 If it compiles, we ship. Mostly.”
* “🐚 Bashful? I’m not.”
* “🔧 Small diffs, big wins.”

---

Want me to fold these exact blocks into your Stage 1–5 “paste‑to‑build” prompts so you can run the workshop straight from slides?
[1]: https://platform.openai.com/docs/api-reference/responses "OpenAI Platform"
[2]: https://platform.openai.com/docs/quickstart?utm_source=chatgpt.com "Developer quickstart - OpenAI API"
[3]: https://platform.openai.com/docs/api-reference/responses?utm_source=chatgpt.com "API Reference"
[4]: https://platform.openai.com/docs/guides/structured-outputs?utm_source=chatgpt.com "Structured model outputs - OpenAI API"
[5]: https://platform.openai.com/docs/guides/migrate-to-responses?utm_source=chatgpt.com "Migrate to the Responses API"
[6]: https://platform.openai.com/docs/guides/function-calling?utm_source=chatgpt.com "Function calling - OpenAI API"
[7]: https://platform.openai.com/docs/guides/tools?utm_source=chatgpt.com "Using tools - OpenAI API"
[8]: https://platform.openai.com/docs/guides/text?utm_source=chatgpt.com "Text generation - OpenAI API"
steipete created this gist
Sep 30, 2025.
Perfect—let’s tune the plan so it’s workshop‑friendly, shows **a few cheeky lines** on startup, **doesn’t print compiler flags**, keeps **macOS 26** only in the **system prompt**, and **expands** the checks so the model + humans both have more to chew on.

---

## What changed (quick)

* Startup prints **multiple cheeky lines** + the masked API key; **no mention of Swift 6.2 or MainActor** in the output.
* The **system prompt** now explicitly contains **“Runtime: macOS 26 or later”** and is more detailed (pasted in full below).
* **Stage acceptance checks** beefed up with explicit I/O, example transcripts, and file expectations.
* The **all‑in‑one (one‑go) build prompt** now embeds the **full system prompt**.

> Build knobs you still apply in code: **Swift 6.2** + **default MainActor isolation** via SPM `SwiftSetting.defaultIsolation(MainActor.self)`; conversation chaining via the **Responses API** with `previous_response_id`; **function tools** for finish/ask‑for‑info; and a **bash tool** with Y/n gating or `--yolo`. ([Swift.org][1])

---

## Stage plan (5 steps, updated + expanded checks)

### 1) “Hello, swagent” — minimal one‑shot

**Build scope**

* Swift 6.2; set default isolation at module level in **SPM**:

```swift
// swift-tools-version: 6.2
// ...
.executableTarget(
    name: "swagent",
    // ...
    swiftSettings: [
        .defaultIsolation(MainActor.self)  // SwiftPM 6.2
    ]
)
```

*Docs:* Swift 6.2 main‑actor default option; SPM `defaultIsolation`. ([Swift.org][1])

* Deps: `swift-argument-parser` (CLI), `swift-configuration` (reads `OPENAI_API_KEY` from the env). ([Apple GitHub][2])
* Call the **OpenAI Responses API** (`model: gpt-5-codex`) once; print the reply. Show token usage from `usage` + elapsed time (a request‑builder sketch follows the checks below). ([OpenAI Platform][3])

**Runtime UX**

* On launch, print **2–3 cheeky lines** (randomly sampled) + the masked API key (`sk‑abc…def0`).
* Flags: `--version`, `-v` (verbose HTTP codes + timings).

**Cheeky lines pool (example)**

```
• 🎩 “I code therefore I am. Hit me.”
• 🧰 “Tabs, spaces, or chaos? Your call.”
• ⚡ “One prompt. One shot. Make it count.”
• 🧪 “If it compiles, we ship. Kidding. Mostly.”
• 🐚 “Bashful? I’m not.”
```

**Expanded checks**

* **Env**: with a key → shows the masked key; without → prints a clear error about the missing `OPENAI_API_KEY` (no stacktrace).
* **CLI**:

  * `swagent --version` → semantic version line only.
  * `swagent -v "What’s 2+2?"` → prints the cheeky intro (2–3 lines), masked key, HTTP 200 in the verbose log, model text, and a footer `(in: X, out: Y, total: Z tokens, 0m 01s)`.
* **Failure paths**: a network error surfaces as a single‑line diagnostic in `-v` mode; non‑`-v` shows a brief “request failed (HTTP NNN)”.

References for the API, tokens, and arg parsing. ([OpenAI Platform][3])
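A sketch of the Stage 1 request builder; the endpoint, headers, and body fields come straight from the Responses API contract used throughout this guide, while the function name and force‑unwrapped URL are illustrative shortcuts:

```swift
import Foundation

func makeResponsesRequest(apiKey: String, instructions: String, prompt: String) throws -> URLRequest {
    var request = URLRequest(url: URL(string: "https://api.openai.com/v1/responses")!)
    request.httpMethod = "POST"
    request.setValue("Bearer \(apiKey)", forHTTPHeaderField: "Authorization")
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")
    let body: [String: Any] = [
        "model": "gpt-5-codex",
        "instructions": instructions,   // re-sent on every turn
        "input": prompt,
        "store": true                   // enables chaining via previous_response_id
    ]
    request.httpBody = try JSONSerialization.data(withJSONObject: body)
    return request
}

// Send with URLSession, e.g.:
// let (data, _) = try await URLSession.shared.data(for: request)
// then read the reply text and `usage` out of the response JSON.
```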
---

### 2) “Sticky chat” — interactive REPL + one‑shot `-p`

**Build scope**

* Add a REPL (loop until `/exit`).
* Keep one‑shot via `-p "…"`.
* Maintain conversation by passing **`previous_response_id`** each turn. **Always resend your system prompt** each request (instructions aren’t auto‑carried). ([OpenAI Platform][4])

**Expanded checks**

* **REPL basics**:

  * Start `swagent` → cheeky lines + masked key → prompt `›`.
  * Type `Hello` → the model replies; prints per‑turn `(tokens, time)`.
* **Commands**:

  * `/new` (alias `/clear`) → response chain reset; the next call has **no** `previous_response_id`.
  * `/exit` → exits the process.
  * `/status` (preview, wired in Stage 5) → prints a “not persisted yet” message in Stage 2.
* **One‑shot**: `swagent -p "Summarize rust vs swift"` → one response, stats footer, exit.
* **State**: verify the second turn includes the **`previous_response_id`** of the first. (You’ll see more `in:` tokens owing to chaining.) ([OpenAI Platform][4])

---

### 3) “Agent signals” — finish / ask‑for‑info tools

**Build scope**

* Add **two function tools** (Responses API `tools`):

  * `finish(summary: string)`
  * `request_more_info(question: string)`
* Implement the tool‑call loop: when a tool is called, send its **tool output** back (bound to the exact `tool_call_id`), then continue. ([OpenAI Platform][5])

**Tool JSON schemas (sketch)**

```json
{
  "type": "function",
  "name": "finish",
  "description": "Signal task completion with a short summary and next steps.",
  "parameters": {
    "type": "object",
    "properties": { "summary": { "type": "string" } },
    "required": ["summary"]
  }
}
```

```json
{
  "type": "function",
  "name": "request_more_info",
  "description": "Ask the user for missing information to proceed.",
  "parameters": {
    "type": "object",
    "properties": { "question": { "type": "string" } },
    "required": ["question"]
  }
}
```

**Expanded checks**

* Prompt: “Draft a minimal README, ask me one clarifying question, then finish.”

  * Model calls `request_more_info` → CLI prints the question and waits for user input → you answer → the model continues → the model calls `finish` with a summary → CLI prints the summary + stats and returns to the prompt (REPL) or exits (`-p`).
* **Verify**: every assistant tool call is followed by a matching **tool output** before continuing (this is required by function‑calling semantics). If you skip it, you’ll hit tool‑output errors. ([OpenAI Platform][5])

---

### 4) “Run commands” — bash tool with guardrails

**Build scope**

* Add a `run_bash(command: string, cwd?: string)` tool:

  * On invocation, print the proposed command and ask **`Run? [Y/n]`** (Enter = Yes).
  * `--yolo` auto‑approves.
  * Execute with `bash -lc "<command>"`; capture `stdout`, `stderr`, `exitCode`; return them as the tool output.
* `-p` mode: still **one‑shot to the user**, but the **agent may loop internally** across tools until it calls `finish` or `request_more_info`.

**Expanded checks**

* `swagent -p "Create hello.sh that prints hello, make it executable, run it"`

  * Shows the planned commands, asks Y/n, runs, shows `stdout: hello`.
  * On failure (non‑zero exit), the model sees `exitCode != 0` + `stderr` and retries.
* `swagent --yolo -p "swift --version"` → executes without a prompt; output returned to the model; prints the final message + stats.
* **Audit**: ensure each tool call’s **tool output** uses the **same `tool_call_id`** field the model provided (a Responses API requirement). ([OpenAI Platform][5])

---

### 5) “Sessions, polish, tests” — persistence + status + lint/format

**Build scope**

* Persist sessions under `~/.swagent/<uuid>.json` via an **actor** (async file I/O).
* `/status` prints:

  * the masked API key,
  * totals for **input/output/total tokens** this session,
  * **estimated context remaining** (based on the model context limit and the running total).
* `--session <uuid>` resumes a saved chain.
* On exit: print *“To resume this session, call `swagent --session <uuid>`.”*
* Tests with **Swift Testing** (Xcode 26 includes it). ([GitHub][6])
* Formatting/linting: **swift-format** + **SwiftLint** targets or scripts. ([GitHub][7])
**Expanded checks**

* **Persistence**:

  * Start a chat, send two turns → exit → check that `~/.swagent/<uuid>.json` exists and includes: the latest `response_id`, cumulative `usage`, timestamps.
  * `swagent --session <uuid>` → prints the greeting + a “session loaded” note, then the REPL prompt. The next turn continues with the `previous_response_id` from the file. ([OpenAI Platform][4])
* **/status** shows something like:

  ```
  Session: 3B83C2A2-…-8F
  API key: sk-abc…f789
  Tokens used: in=1,820 out=980 total=2,800
  Context headroom (est.): ~170k tokens left
  ```

  (Token counts come from the **Responses API `usage`**; the context estimate is the model limit minus running input/output tokens; show it as an estimate.) ([OpenAI Platform][8])
* **Tests**:

  * `CLITests`: `--version`, `-p`, `--yolo`, `--session`.
  * `SessionStoreTests`: save/load round‑trip; concurrent reads/writes guarded by the actor.
  * `ToolApprovalTests`: Y/n prompt default acceptance; `--yolo` bypass.
* **Lint/format**: `make fmt lint` passes (no warnings on default rules).

---

## System prompt (paste verbatim)

> **You are swagent**, a focused coding agent optimized for terminal workflows.
> **Runtime:** **macOS 26 or later**.
> **Mission:** Help the user build, run, and refine code and shell workflows efficiently.
> **Behavior rules:**
>
> * Think step‑by‑step; propose small diffs; prefer minimal, working patches.
> * Only run shell commands via the `run_bash` tool, after clearly proposing what will run.
> * If missing info, call `request_more_info(question)`.
> * When done, call `finish(summary)` with a concise summary + next steps.
> * Never print or exfiltrate secrets; avoid destructive commands unless explicitly asked.
> * Keep answers **terminal‑friendly** and concise.
>
> **Tools available:**
>
> 1. `run_bash(command: string, cwd?: string)` — execute a shell command and read its output.
> 2. `request_more_info(question: string)` — ask the user for specifics and wait.
> 3. `finish(summary: string)` — signal completion and stop.
>
> **Conversation:** You are part of a chat. Treat each turn as a continuation. When the user uses one‑shot `-p`, you may internally loop tool calls until you either `finish` or must `request_more_info`.

*(Note: we keep the OS here; we intentionally don’t print compiler flags or isolation mode at runtime.)*

---

## One‑go build prompt (full brief, **now includes the system prompt**)

> **Project name:** `swagent`
> **Environment:** Swift 6.2 with **default MainActor isolation** (via `.defaultIsolation(MainActor.self)` in SPM), **macOS host**.
> **Dependencies:** `swift-argument-parser`, `apple/swift-configuration`; built‑in **Swift Testing**; plus **swift-format** and **SwiftLint**.
> **System prompt to embed (verbatim):**
> *[insert the “System prompt (paste verbatim)” block above]*
> **Build & UX requirements:**
>
> * **Startup:** print 2–3 cheeky lines (random), then the masked API key (first 3 + last 4). No mention of compiler flags or isolation modes.
> * **Flags:** `-v/--verbose` (extra logs + HTTP status), `--version`, `-p <prompt>` (one‑shot to the user; internal tool loop), `--yolo` (auto‑approve tools), `--session <uuid>`.
> * **Chat:** interactive REPL with `/new` `/clear` `/status` `/exit`. Maintain state via **`previous_response_id`** and **`store:true`**; **always** include the **system prompt** each turn.
> * **Tools (function calling):**
>
>   * `run_bash(command, cwd?)` → ask **Y/n** per call (Enter = Yes) unless `--yolo`. Run via `bash -lc`. Return `{stdout, stderr, exitCode}` bound to the same `tool_call_id`.
>   * `request_more_info(question)` → print the question; wait for user input.
>   * `finish(summary)` → end; print the summary.
> * **Per‑turn stats:** print `(in: N, out: M, total: T tokens, 0m 00s)` using the **Responses API `usage`** and a monotonic timer.
> * **Sessions:** persist under `~/.swagent/<uuid>.json`; `/status` shows the masked key, token totals, and an **estimated** remaining context; on exit: *“To resume this session, call `swagent --session <uuid>`.”*
> * **Engineering:** idiomatic Swift 6.2; strict concurrency (MainActor by default); state in `actor`s; DTOs `Sendable`; clean terminal text output; `swift test` with **Swift Testing**; `swift-format`/`SwiftLint` wired via scripts or SPM plugins.
> * **Responses API details to follow precisely:** function tools + tool outputs, chaining via `previous_response_id`, `usage` tokens. ([OpenAI Platform][3])

---

## Snippets you can drop in

**Cheeky greetings helper**

```swift
enum Greetings {
    static let pool: [[String]] = [
        ["🎩 I code therefore I am.", "⚡ One prompt. One shot.", "🧰 Tabs, spaces, or chaos?"],
        ["🧪 If it compiles, we ship.", "🐚 Bashful? I’m not.", "📦 Got packages? I do."],
        ["🤖 Ship it?", "🔧 Small diffs, big wins.", "🧭 Point me at a repo."]
    ]
    static func random() -> [String] { pool.randomElement() ?? ["👋 Hey"] }
}
```

**Tool schemas (Swift types → JSON)**

```swift
struct RunBashArgs: Codable { let command: String; let cwd: String? }
struct MoreInfoArgs: Codable { let question: String }
struct FinishArgs: Codable { let summary: String }
```

Use these to build `tools: [...]` for the Responses API request. ([OpenAI Platform][3])

**Session file shape (`~/.swagent/<uuid>.json`)**

```json
{
  "id": "3B83C2A2-...-8F",
  "createdAt": "2025-09-30T19:12:03Z",
  "previous_response_id": "resp_abc123",
  "usage": { "input_tokens": 1820, "output_tokens": 980, "total_tokens": 2800 },
  "history": ["resp_abc123", "resp_def456"]
}
```

**Makefile mini**

```makefile
fmt: ; swift-format --in-place --recursive Sources Tests
lint: ; swiftlint
test: ; swift test -v
check: fmt lint test
```

Formatter / linter docs. ([GitHub][7])

---

## Why the model will behave better with this setup

* **Richer acceptance checks** = clearer affordances for tool loops, approval prompts, and session continuity (what to do next is unambiguous).
* The **system prompt now encodes the runtime (macOS 26)** and strict tool etiquette, so the assistant knows how to “play by the rules” in every turn.
* **SPM‑level default isolation** handles the concurrency foot‑guns without cluttering runtime output. ([Swift.org][1])

---

## Sources

* Swift 6.2 (default MainActor isolation & single‑threaded option). ([Swift.org][1])
* SPM `SwiftSetting.defaultIsolation` docs (PackageDescription 6.2). ([Swift Documentation][9])
* Swift Testing (toolchain framework). ([GitHub][6])
* swift-configuration (env provider). ([GitHub][10])
* ArgumentParser docs. ([Apple GitHub][2])
* Responses API reference; function calling; conversation state & `previous_response_id`; usage tokens. ([OpenAI Platform][3])
* swift-format; SwiftLint. ([GitHub][7])

Want me to also push a tiny repo skeleton with the `Package.swift`, `CLI.swift`, the `SessionStore` actor, and a **Swift Testing** smoke test so you can kick off Stage 1 instantly?
[1]: https://swift.org/blog/swift-6.2-released/?utm_source=chatgpt.com "Swift 6.2 Released"
[2]: https://apple.github.io/swift-argument-parser/documentation/argumentparser/?utm_source=chatgpt.com "ArgumentParser | Documentation - Apple"
[3]: https://platform.openai.com/docs/api-reference/responses?utm_source=chatgpt.com "API Reference"
[4]: https://platform.openai.com/docs/guides/conversation-state?utm_source=chatgpt.com "Conversation state - OpenAI API"
[5]: https://platform.openai.com/docs/guides/function-calling?utm_source=chatgpt.com "Function calling - OpenAI API"
[6]: https://github.com/swiftlang/swift-testing?utm_source=chatgpt.com "swiftlang/swift-testing: A modern, expressive ..."
[7]: https://github.com/swiftlang/swift-format?utm_source=chatgpt.com "Formatting technology for Swift source code"
[8]: https://platform.openai.com/docs/api-reference/usage?utm_source=chatgpt.com "API Reference"
[9]: https://docs.swift.org/swiftpm/documentation/packagedescription/swiftsetting/defaultisolation%28_%3A_%3A%29/?utm_source=chatgpt.com "defaultIsolation(_:_:)"
[10]: https://github.com/apple/swift-configuration?utm_source=chatgpt.com "apple/swift-configuration: API package for reading ..."