let's go 🚀

Getting Started

From zero to your first transcription in under 5 minutes.

⚠️ Requirements
  • macOS 14.0 (Sonoma) or later
  • Apple Silicon (M1 or later) — required for on-device ML
  • 8GB RAM minimum (16GB recommended for best models)
  • ~2–4GB disk space for AI models
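
The requirements above can be checked from a terminal before installing. The following is a minimal sketch and not part of YapYap: the function names `version_ge` and `check_requirements` are hypothetical, and it assumes `uname -m` reports `arm64` on Apple Silicon (a shell running under Rosetta would report `x86_64` instead). It does not check free disk space.

```shell
#!/bin/sh
# Hypothetical helper: thresholds are taken from the requirements list above.

# version_ge A B: succeeds when dotted version A >= B (major.minor only).
version_ge() {
    a_major=${1%%.*}
    b_major=${2%%.*}
    a_rest=${1#*.}
    b_rest=${2#*.}
    # A version with no dot (e.g. "15") has minor version 0.
    if [ "$a_rest" = "$1" ]; then a_rest=0; fi
    if [ "$b_rest" = "$2" ]; then b_rest=0; fi
    a_minor=${a_rest%%.*}
    b_minor=${b_rest%%.*}
    if [ "$a_major" -gt "$b_major" ]; then return 0; fi
    if [ "$a_major" -lt "$b_major" ]; then return 1; fi
    [ "$a_minor" -ge "$b_minor" ]
}

# check_requirements OS_VERSION ARCH RAM_GB
# Prints the first unmet requirement and fails; prints nothing on success.
check_requirements() {
    os_version=$1
    arch=$2
    ram_gb=$3
    if ! version_ge "$os_version" "14.0"; then
        echo "macOS 14.0 or later required (found $os_version)"
        return 1
    fi
    if [ "$arch" != "arm64" ]; then
        echo "Apple Silicon required (found $arch)"
        return 1
    fi
    if [ "$ram_gb" -lt 8 ]; then
        echo "8GB RAM minimum (found ${ram_gb}GB)"
        return 1
    fi
    return 0
}

# On a Mac, feed in the real values. hw.memsize is reported in bytes.
if [ "$(uname -s)" = "Darwin" ]; then
    total_gb=$(( $(sysctl -n hw.memsize) / 1073741824 ))
    if check_requirements "$(sw_vers -productVersion)" "$(uname -m)" "$total_gb"; then
        echo "This Mac meets the listed requirements."
    fi
fi
```

Keeping the checks in a function that takes plain values makes them easy to try with made-up inputs, for example `check_requirements 13.6 arm64 16`.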

1. Install YapYap

Choose your preferred installation method:

⬇ Direct Download (recommended)

Download the latest .dmg from GitHub Releases, open it, and drag YapYap.app to your Applications folder.

Download .dmg →

🍺 Homebrew

Install and update via Homebrew Cask.

brew install --cask yapyap

🔨 Build from Source

For developers who want to modify and run locally.

git clone https://github.com/sunboylabs/YapYap.git
cd YapYap
brew install xcodegen
xcodegen generate
make build

2. Grant Permissions

YapYap needs two permissions: Microphone and Accessibility. It prompts you for both on first launch.

🎤 Microphone
Required to capture your voice. macOS will show a standard permission dialog when you first try to record. Click Allow.

Accessibility
Required to paste text into other apps. You must enable it manually in System Settings:
  1. Open System Settings → Privacy & Security → Accessibility
  2. Find YapYap in the list and toggle it ON
⚠️ If you build from source, you must re-grant this permission after every build. The code signature changes, and macOS silently revokes the grant.
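
If you rebuild often, opening the Accessibility pane from the terminal can save a few clicks. This is a sketch under assumptions: the `x-apple.systempreferences:` deep link below is widely used but not formally documented by Apple, so it may change between macOS releases, and the function names are hypothetical. You still have to toggle YapYap ON yourself.

```shell
#!/bin/sh
# Hypothetical helpers; the deep-link format is an assumption (see above).

# Build the deep link for a Privacy & Security pane, e.g. "Accessibility".
privacy_pane_url() {
    printf '%s\n' "x-apple.systempreferences:com.apple.preference.security?Privacy_$1"
}

# Open that pane in System Settings (macOS only; uses the standard `open` tool).
open_privacy_pane() {
    open "$(privacy_pane_url "$1")"
}

# After a source build, run:
#   open_privacy_pane Accessibility
```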

3. Download Your First Model

YapYap ships without models; you choose what to download. On first launch, the onboarding flow walks you through this. You can also download models from Settings → Models.

Recommended starter setup
STT: Parakeet TDT v3 (~600MB)
Fastest option. Runs on the Neural Engine, so it doesn't use your RAM budget. Good for English and 5 other languages.
LLM: Gemma 3 4B (~3.0GB)
Best cleanup quality in the medium tier. 140+ languages. Recommended for Macs with 16GB of RAM or more.
On 8GB RAM: use Qwen 2.5 1.5B (~1GB) instead.

Models download from Hugging Face and are stored in ~/Library/Application Support/YapYap/models/. Once they are downloaded, no internet connection is needed.
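
The starter guidance and storage location above can be expressed as a short script. This is an illustrative sketch, not something YapYap provides: `models_dir` and `recommend_llm` are hypothetical names, and treating every machine under 16GB like the 8GB case is an assumption, since the guidance only names those two sizes.

```shell
#!/bin/sh
# Hypothetical helpers based on the starter setup described above.

# models_dir: the directory where downloaded models are stored.
models_dir() {
    printf '%s\n' "$HOME/Library/Application Support/YapYap/models"
}

# recommend_llm RAM_GB: starter LLM for a machine with RAM_GB of memory.
# Assumption: anything below 16GB gets the 8GB recommendation.
recommend_llm() {
    if [ "$1" -ge 16 ]; then
        echo "Gemma 3 4B"
    else
        echo "Qwen 2.5 1.5B"
    fi
}

# Show how much disk space the downloaded models use, if any exist yet.
if [ -d "$(models_dir)" ]; then
    du -sh "$(models_dir)"
fi
```

For example, `recommend_llm 8` suggests the smaller Qwen model, while `recommend_llm 16` suggests Gemma.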

4. Make Your First Recording

⌥ Option + Space: Push-to-Talk (default)
Hold the keys down, speak, then release. Text appears where your cursor is.
⌥ Option + ⇧ Shift + Space: Hands-Free Mode
Press once to start recording and again to stop. Recording also stops automatically when it detects silence.

The floating bar (the little creature) appears while you're recording. Its animation reflects the current state: sleeping in the menu bar, attentive while listening, spinning while transcribing.

5. Explore the Modes

🎙️ Dictation
The default — speak naturally and get clean, formatted text pasted wherever your cursor is.
✏️ Command Mode
Highlight text in any app, press the hotkey, say a command ("make this more formal") — AI rewrites it in place.
📋 History
Browse and copy all past transcriptions from the popover or history window.
⚙️ Settings
Click the creature in the menu bar → Settings. Or use the keyboard shortcut shown in the popover.
🧠 Choose your models
Learn which speech and language models are best for your machine and use case.
Models guide →