    # Valet Vision Architecture Overview

    Valet Vision is a flexible, open-source automation platform built on the Raspberry Pi. It enables precise control and observation of mobile devices through a combination of computer vision, hardware control, and network-accessible APIs.

    ---

    ## 📐 Architecture

    At its core, **Valet Vision** runs an HTTP server on a Raspberry Pi. This server exposes a simple JSON-over-HTTP API to:

    - Access the **camera feed**
    - Simulate **virtual mouse**, **keyboard**, and **stylus** inputs
    - Capture **screenshots** and stream **live MJPEG video**

    All requests can be made locally or over the network. The API is platform-agnostic, allowing automation scripts to run:

    - **Locally on the Valet** itself
    - On another machine within the same network

    🗺️ Architecture diagram: [valetnet.dev/overview](https://valetnet.dev/overview/)
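
For example, fetching a screenshot from another machine on the network takes a single HTTP request. The sketch below uses Python's `requests` library; the host address and the `/api/screenshot` path are illustrative assumptions (only the tap endpoint shown later in this document is confirmed).

```python
import requests

# Hypothetical address of the Valet Vision HTTP server on the LAN.
HOST = "http://valet.local:8080"

# Fetch the current screen as a JPEG. The endpoint path is an
# assumption for illustration; check the server for the real route.
resp = requests.get(f"{HOST}/api/screenshot",
                    headers={"Accept": "image/jpeg"}, timeout=10)
resp.raise_for_status()

with open("screen.jpg", "wb") as f:
    f.write(resp.content)
```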

    ---

    ## 🔌 USB Gadget Protocol

    Valet Vision uses Linux’s **USB Gadget** protocol to emulate USB peripherals to a connected mobile device. This includes:

    - Virtual **touch stylus**
    - USB **keyboard** and **mouse**
- Optional **Ethernet gadget** mode to share the Pi's network connection with the mobile device, or to deliberately restrict the device's network access

    This allows both input simulation and network configuration of the attached phone or tablet, all via a single USB connection.
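
Under the hood, Linux builds composite gadgets through configfs. The sketch below shows how a virtual HID keyboard is typically assembled on a Pi; the configfs paths follow the standard kernel gadget layout, while the gadget name, vendor/product IDs, and descriptor choice are illustrative assumptions, not Valet Vision's actual configuration.

```python
from pathlib import Path

# Standard USB boot-keyboard HID report descriptor.
KBD_REPORT_DESC = bytes([
    0x05, 0x01, 0x09, 0x06, 0xA1, 0x01, 0x05, 0x07, 0x19, 0xE0,
    0x29, 0xE7, 0x15, 0x00, 0x25, 0x01, 0x75, 0x01, 0x95, 0x08,
    0x81, 0x02, 0x95, 0x01, 0x75, 0x08, 0x81, 0x03, 0x95, 0x05,
    0x75, 0x01, 0x05, 0x08, 0x19, 0x01, 0x29, 0x05, 0x91, 0x02,
    0x95, 0x01, 0x75, 0x03, 0x91, 0x03, 0x95, 0x06, 0x75, 0x08,
    0x15, 0x00, 0x25, 0x65, 0x05, 0x07, 0x19, 0x00, 0x29, 0x65,
    0x81, 0x00, 0xC0,
])

# Requires root and `modprobe libcomposite`; the gadget name is an assumption.
g = Path("/sys/kernel/config/usb_gadget/valet")
g.mkdir(parents=True, exist_ok=True)
(g / "idVendor").write_text("0x1d6b")   # placeholder vendor/product IDs
(g / "idProduct").write_text("0x0104")

# One HID keyboard function inside one configuration.
fn = g / "functions/hid.usb0"
fn.mkdir(parents=True, exist_ok=True)
(fn / "protocol").write_text("1")        # 1 = keyboard
(fn / "subclass").write_text("1")        # boot interface subclass
(fn / "report_length").write_text("8")
(fn / "report_desc").write_bytes(KBD_REPORT_DESC)

cfg = g / "configs/c.1"
cfg.mkdir(parents=True, exist_ok=True)
(cfg / "hid.usb0").symlink_to(fn)

# Bind to the Pi's USB device controller so the phone enumerates the gadget.
udc = next(Path("/sys/class/udc").iterdir()).name
(g / "UDC").write_text(udc)
```

Once bound, keystrokes are injected by writing 8-byte HID reports to the `/dev/hidg0` device that the kernel creates for the function.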

    ---

    ## 👁️ Vision + AI Capabilities

    - Screenshots can be fetched as `image/jpeg` or `image/png`
    - A **live video stream** is available as an MJPEG feed over HTTP

    Valet Vision includes:

    - **OpenCV** for computer vision and object detection
    - **Tesseract OCR** for text recognition

    These tools allow automation scripts to detect and act on visual UI elements in screenshots.
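
As a concrete illustration, the sketch below grabs one frame from the MJPEG feed with OpenCV, locates a UI element by template matching, and reads on-screen text with Tesseract. The host address, stream path, template filename, and match threshold are assumptions for the example.

```python
import cv2
import pytesseract

HOST = "http://valet.local:8080"  # hypothetical address

# OpenCV can open MJPEG-over-HTTP streams directly; the path is an assumption.
cap = cv2.VideoCapture(f"{HOST}/api/video.mjpeg")
ok, frame = cap.read()
cap.release()
assert ok, "could not read a frame from the MJPEG stream"

# Locate a UI element (a button screenshot saved earlier) by template matching.
template = cv2.imread("button.png")
result = cv2.matchTemplate(frame, template, cv2.TM_CCOEFF_NORMED)
_, score, _, top_left = cv2.minMaxLoc(result)
if score > 0.8:
    h, w = template.shape[:2]
    center = (top_left[0] + w // 2, top_left[1] + h // 2)
    print(f"button found at {center} (score {score:.2f})")

# OCR the whole frame; Tesseract expects RGB, OpenCV delivers BGR.
text = pytesseract.image_to_string(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
print(text)
```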

    For more demanding AI workloads, you have options:
- Add the [Raspberry Pi AI Kit (Hailo-8L neural accelerator)](https://www.raspberrypi.com/products/ai-kit/)
    - Run ML inference remotely on a more powerful machine
    - Use hosted services to offload heavy image processing

    Importantly, **Valet Vision operates fully offline by default**—it does not require or depend on any cloud services.

    ---

    ## 📲 Push Button Module (PBM)

    For full-device automation, Valet Vision can control hardware side buttons (e.g., power, volume up/down) via an optional **Push Button Module** (PBM):

    - PBM actuators are **digitally controlled servos**
    - Connected via **Dynamixel Protocol 2.0** over serial from the Pi
    - Placement is flexible—PBM arms can be positioned on either side of the device

    Support for PBM control via HTTP API is on the roadmap.

    🔗 More on the Dynamixel Protocol: [emanual.robotis.com](https://emanual.robotis.com/docs/en/dxl/protocol2/)
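
For reference, a button press driven over Protocol 2.0 might look like the sketch below, using ROBOTIS's official `dynamixel_sdk` Python package. The serial port, servo ID, baud rate, goal positions, and X-series control-table addresses are all assumptions; the PBM's actual servo model and wiring may differ.

```python
import time
from dynamixel_sdk import PortHandler, PacketHandler  # pip install dynamixel-sdk

# Assumed wiring and servo settings; adjust to the actual PBM hardware.
PORT, BAUD, DXL_ID = "/dev/ttyUSB0", 57600, 1
ADDR_TORQUE_ENABLE, ADDR_GOAL_POSITION = 64, 116  # X-series control table

port = PortHandler(PORT)
packet = PacketHandler(2.0)  # Dynamixel Protocol 2.0
assert port.openPort() and port.setBaudRate(BAUD)

# Enable torque, swing the arm to press the side button, then release.
packet.write1ByteTxRx(port, DXL_ID, ADDR_TORQUE_ENABLE, 1)
packet.write4ByteTxRx(port, DXL_ID, ADDR_GOAL_POSITION, 2200)  # press (example)
time.sleep(0.5)
packet.write4ByteTxRx(port, DXL_ID, ADDR_GOAL_POSITION, 2048)  # release (center)

port.closePort()
```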

    ---

    ## 🧪 Developer Tools & Open Source

    The software that powers Valet Vision is fully open source:

    - Core Server: [checkbox-server on GitHub](https://github.com/tapsterbot/checkbox-server)
    - Python Client: [checkbox-client-python](https://github.com/tapsterbot/checkbox-client-python)

    ### Example: Simulate a Tap
```bash
curl -X POST "$HOST/api/touch/tap" \
  -H "Content-Type: application/json" \
  -d '{"x": 0, "y": 0}'
```
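
The same call from Python, using plain `requests` rather than the checkbox Python client (whose exact interface isn't shown here); the host address is an assumption:

```python
import requests

HOST = "http://valet.local:8080"  # hypothetical address of the Pi

# Same payload as the curl example above.
resp = requests.post(f"{HOST}/api/touch/tap", json={"x": 0, "y": 0}, timeout=5)
resp.raise_for_status()
```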

    ---

    ## 🧩 Modular, Local-First Automation

    Valet Vision is designed to be adaptable:

    - Fully autonomous (all logic and inference on-device)
    - Or controlled remotely from any machine on the same network
    - No mandatory cloud infrastructure

    This makes it ideal for labs, local QA environments, and regulated industries where network control and data locality matter.