YapYap Documentation

Everything you need to get up and running, pick the right models for your machine, and tune the AI to write exactly the way you want.


How it works

Every recording goes through this five-stage pipeline, typically in under 3 seconds on a recent Mac.

1. Audio Capture: AVAudioEngine captures 16 kHz mono audio from your microphone while you hold the hotkey.
2. VAD Filtering: Silero VAD strips silence and background noise before the audio reaches the STT model, preventing Whisper hallucinations.
3. Speech-to-Text: Parakeet TDT v3 (Neural Engine) or Whisper (Core ML, GPU) converts audio to raw text on-device.
4. LLM Cleanup: A local LLM (Qwen / Llama / Gemma via MLX, llama.cpp, or Ollama) removes fillers, fixes grammar, and formats the text for your active app.
5. Paste: The cleaned text is injected into your active app via the clipboard and a synthetic Cmd+V; no typing required.
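The five stages above amount to a chain of transformations from raw audio to clean text. A minimal sketch of that shape, with toy stand-ins for the real engines (all type and stage names here are illustrative, not YapYap's actual API):

```swift
import Foundation

// Hedged sketch of the pipeline as composable stages. The VAD, STT, and
// LLM bodies below are toy stubs standing in for Silero, Whisper/Parakeet,
// and the local LLM respectively.
typealias Stage<In, Out> = (In) -> Out

struct RawAudio { var samples: [Float] }     // 16 kHz mono from AVAudioEngine
struct SpeechAudio { var samples: [Float] }  // silence stripped by VAD
struct Transcript { var text: String }       // raw STT output

// 2. VAD: drop near-silent samples (a toy threshold stands in for Silero)
let vad: Stage<RawAudio, SpeechAudio> = { audio in
    SpeechAudio(samples: audio.samples.filter { abs($0) > 0.01 })
}

// 3. STT stub: a real build calls WhisperKit or Parakeet here
let transcribe: Stage<SpeechAudio, Transcript> = { _ in
    Transcript(text: "um, so like, send the report")
}

// 4. LLM cleanup stub: strip filler words (a local LLM does this for real)
let cleanup: Stage<Transcript, String> = { t in
    let fillers: Set<String> = ["um,", "so", "like,"]
    return t.text.split(separator: " ")
        .filter { !fillers.contains(String($0)) }
        .joined(separator: " ")
}

// Stages 1 and 5 (capture, paste) are platform I/O and omitted here.
let pipeline: Stage<RawAudio, String> = { audio in
    cleanup(transcribe(vad(audio)))
}
```

Because each stage is a plain function, backends can be swapped (for example, Whisper for Parakeet) without touching the rest of the chain.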

Architecture

YapYap is a native Swift + SwiftUI app. No Electron, no web views, no cloud.

STT Layer
  • WhisperKit (CoreML)
  • FluidAudio / Parakeet (ANE)
  • whisper.cpp (GGML)
  • Apple SpeechAnalyzer (macOS 26+)
LLM Layer
  • MLX Swift (safetensors)
  • llama.cpp (GGUF)
  • Ollama (HTTP API)
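Of the three LLM backends, the Ollama route is the simplest to picture: a plain HTTP POST to the local daemon. A sketch of building that request against Ollama's standard `/api/generate` endpoint (the model name is an example, not YapYap's shipped default):

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking
#endif

// Build a request for Ollama's local HTTP API (default port 11434).
// "qwen2.5:3b" is an illustrative model tag, not a YapYap default.
func makeOllamaRequest(prompt: String, model: String = "qwen2.5:3b") -> URLRequest {
    var request = URLRequest(url: URL(string: "http://localhost:11434/api/generate")!)
    request.httpMethod = "POST"
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")
    let body: [String: Any] = [
        "model": model,
        "prompt": prompt,
        "stream": false  // one JSON response instead of streamed chunks
    ]
    request.httpBody = try? JSONSerialization.data(withJSONObject: body)
    return request
}
```

Sending it with `URLSession.shared.dataTask(with:)` and decoding the `response` field of the returned JSON yields the cleaned text.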
Context Layer
  • NSWorkspace app detection
  • AX API window/field reading
  • 11 app categories
  • Per-category prompt rules
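The context layer boils down to a lookup: bundle identifier → app category → prompt rule. A sketch of that mapping, where every bundle ID, category name, and rule string is illustrative rather than YapYap's real table:

```swift
import Foundation

// Sketch of per-app prompt selection. In the real app the bundle ID comes
// from NSWorkspace.shared.frontmostApplication?.bundleIdentifier; here we
// take it as a parameter so the logic is testable anywhere.
enum AppCategory {
    case codeEditor, email, chat, general
}

// Hypothetical sample of the bundle-ID-to-category table.
let categoryByBundleID: [String: AppCategory] = [
    "com.apple.dt.Xcode": .codeEditor,
    "com.apple.mail": .email,
    "com.tinyspeck.slackmacgap": .chat,
]

func promptRule(forBundleID id: String?) -> String {
    let category = id.flatMap { categoryByBundleID[$0] } ?? .general
    switch category {
    case .codeEditor: return "Preserve identifiers and code terms verbatim."
    case .email:      return "Use full sentences and a polite sign-off."
    case .chat:       return "Keep it casual; short sentences are fine."
    case .general:    return "Clean fillers; keep the speaker's tone."
    }
}
```

Unknown or nil bundle IDs fall back to the general rule, so dictation still works in apps the table has never seen.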
Data & UI
  • SwiftData (SQLite)
  • AVAudioEngine
  • Silero VAD
  • Sparkle auto-update
  • KeyboardShortcuts
  • SwiftUI + AppKit hybrid

All models are stored in ~/Library/Application Support/YapYap/models/ — never in ~/Documents (iCloud eviction hazard).
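A sketch of resolving that models directory with FileManager rather than a hard-coded home path (the helper itself is illustrative; only the `YapYap/models` layout comes from the docs above):

```swift
import Foundation

// Resolve ~/Library/Application Support/YapYap/models/ via the system API
// so the path is correct for any user account, and create it if missing.
func modelsDirectory() throws -> URL {
    let support = try FileManager.default.url(
        for: .applicationSupportDirectory,
        in: .userDomainMask,
        appropriateFor: nil,
        create: false
    )
    let dir = support
        .appendingPathComponent("YapYap", isDirectory: true)
        .appendingPathComponent("models", isDirectory: true)
    try FileManager.default.createDirectory(at: dir, withIntermediateDirectories: true)
    return dir
}
```

Using `.applicationSupportDirectory` keeps model files out of iCloud-synced locations such as ~/Documents.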

Ready to start?
Follow the Getting Started guide to be up and running in 5 minutes.
Get started →