# Project Overview

## Overview

Yakety is a real-time speech-to-text application that provides instant transcription through keyboard shortcuts. It records audio while a hotkey is held down, transcribes the speech using OpenAI's Whisper model, and automatically pastes the transcribed text into the active application. The application is designed for efficient voice-to-text input across desktop workflows.

The project targets both CLI and GUI usage patterns, supporting macOS and Windows with platform-specific implementations. It integrates whisper.cpp for on-device transcription, eliminating the need for cloud services while maintaining privacy. The application features a system tray interface for GUI mode and comprehensive keyboard monitoring for seamless user interaction.

## Key Files

- **src/main.c**: Primary application entry point containing initialization sequence, audio processing pipeline, and keyboard event handling (lines 254-388)
- **src/app.h**: Cross-platform application framework with platform-specific entry point macros (lines 6-43) and async execution utilities
- **CMakeLists.txt**: Build system configuration managing whisper.cpp integration (lines 28-32), platform-specific compilation (lines 48-85), and distribution packaging (lines 358-535)
- **src/transcription.cpp**: Whisper model integration and audio processing core (lines 49-100)

## Technology Stack

- **Audio Processing**: miniaudio library for cross-platform audio capture in src/audio.c with 16 kHz mono configuration (lines 9-11)
- **Speech Recognition**: whisper.cpp integration for local transcription processing in src/transcription.cpp (lines 14-15)
- **Platform Abstraction**: C-style C++ implementation with platform-specific modules in src/mac/ and src/windows/
- **Build System**: CMake with custom modules in cmake/ directory, supporting Ninja and Visual Studio generators
- **GUI Framework**:
  - macOS: Objective-C/Swift UI in src/mac/dialogs/ with SwiftUI dialogs
  - Windows: Win32 API in src/windows/ with native dialog implementations

## Platform Support

**macOS Requirements**:

- Minimum macOS 14.0 (Apple Silicon only, set in CMakeLists.txt line 22)
- Accessibility permissions for keyboard monitoring (handled in src/main.c lines 78-117)
- Metal acceleration support via ggml-metal library integration
- System tray menubar interface in src/mac/menu.m

**Windows Requirements**:

- Windows 10+ with Visual Studio 2022 build tools
- Optional Vulkan support for GPU acceleration
- WSL development environment supported via scripts in wsl/ directory
- System tray interface in src/windows/menu.c

**Cross-Platform Components**:

- Keyboard monitoring: src/mac/keylogger.c and src/windows/keylogger.c
- Audio recording: src/audio.c with platform-specific audio device handling
- Preferences storage: src/preferences.c with platform-specific configuration paths
- Model management: src/models.c with bundled and downloadable Whisper models defined in src/model_definitions.h
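
The hold-to-record flow described in the overview (record while the hotkey is held, then hand the clip off for transcription on release) can be pictured as a two-state machine. The sketch below is only an illustration of that idea, not the code in src/main.c: the `HoldToTalk` type and every `htt_*` function are hypothetical names invented here, and the only detail taken from the document is the 16 kHz mono capture format.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Capture format stated in the overview: 16 kHz, mono,
   so one frame equals one sample. */
#define SAMPLE_RATE_HZ 16000

/* Hypothetical states for a hold-to-talk recorder. */
typedef enum { STATE_IDLE, STATE_RECORDING } RecorderState;

typedef struct {
    RecorderState state;
    size_t frames_captured; /* frames accumulated during the current hold */
    size_t frames_ready;    /* frames in the last completed clip */
} HoldToTalk;

void htt_init(HoldToTalk *r) {
    assert(r != NULL);
    r->state = STATE_IDLE;
    r->frames_captured = 0;
    r->frames_ready = 0;
}

/* Hotkey pressed. Returns true only when a new recording starts, so
   OS key-repeat events during a long hold are ignored. */
bool htt_key_down(HoldToTalk *r) {
    assert(r != NULL);
    if (r->state == STATE_RECORDING) {
        return false;
    }
    r->state = STATE_RECORDING;
    r->frames_captured = 0;
    return true;
}

/* Audio callback. Frames arriving while idle are dropped. */
void htt_on_audio(HoldToTalk *r, size_t frames) {
    assert(r != NULL);
    if (r->state == STATE_RECORDING) {
        r->frames_captured += frames;
    }
}

/* Hotkey released. Returns true when a clip is ready to be
   transcribed and pasted; false if no recording was in progress. */
bool htt_key_up(HoldToTalk *r) {
    assert(r != NULL);
    if (r->state != STATE_RECORDING) {
        return false;
    }
    r->state = STATE_IDLE;
    r->frames_ready = r->frames_captured;
    return true;
}

/* Duration of the last completed clip in milliseconds. */
size_t htt_ready_millis(const HoldToTalk *r) {
    assert(r != NULL);
    return r->frames_ready * 1000u / SAMPLE_RATE_HZ;
}
```

In this sketch a caller would invoke `htt_key_down` and `htt_key_up` from the keyboard monitor and `htt_on_audio` from the audio capture callback. When `htt_key_up` returns true, the captured clip would be passed to the transcription step. Returning false on repeated key-down events keeps a long hold from restarting the recording.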