Getting Started

From zero to your first transcription in under 5 minutes.

⚠️ Requirements

macOS 14.0 (Sonoma) or later
Apple Silicon (M1 or later) — required for on-device ML
8GB RAM minimum (16GB recommended for best models)
~2–4GB disk space for AI models
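
The requirements above can be checked from a terminal before installing. This is a minimal sketch, assuming the standard macOS tools `sw_vers`, `sysctl`, and `df`; the function names are illustrative and not part of YapYap.

```shell
#!/bin/sh
# Sketch: compare this machine against the requirements listed above.
# Function names are hypothetical helpers, not part of YapYap.

# Succeeds when the major part of dotted version $1 is >= $2.
macos_major_at_least() {
  major=${1%%.*}
  [ "$major" -ge "$2" ]
}

# Converts a byte count to whole GiB.
bytes_to_gib() {
  echo $(( $1 / 1024 / 1024 / 1024 ))
}

# Prints a RAM verdict for a size in GB.
ram_verdict() {
  if [ "$1" -ge 16 ]; then
    echo "recommended"
  elif [ "$1" -ge 8 ]; then
    echo "minimum"
  else
    echo "insufficient"
  fi
}

# Only query the system when the macOS tools are present.
if command -v sw_vers >/dev/null 2>&1; then
  version=$(sw_vers -productVersion)
  if macos_major_at_least "$version" 14; then
    echo "macOS $version: ok"
  else
    echo "macOS $version: needs 14.0 or later"
  fi

  # Note: a shell running under Rosetta reports x86_64 here.
  if [ "$(uname -m)" = "arm64" ]; then
    echo "CPU: Apple Silicon"
  else
    echo "CPU: $(uname -m) (Apple Silicon required)"
  fi

  ram_gib=$(bytes_to_gib "$(sysctl -n hw.memsize)")
  echo "RAM: ${ram_gib}GB ($(ram_verdict "$ram_gib"))"

  # Free space on the volume that holds the home directory.
  df -h "$HOME" | awk 'NR==2 {print "Free disk: " $4}'
fi
```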

Install YapYap

Choose your preferred installation method:

⬇ Direct Download (recommended)

Get the latest .dmg from GitHub Releases. Drag YapYap.app to your Applications folder.

Download .dmg →

🍺 Homebrew

Install and update via Homebrew Cask.

brew install --cask yapyap

🔨 Build from Source

For developers who want to modify and run locally.

git clone https://github.com/sunboylabs/YapYap.git
cd YapYap
brew install xcodegen
xcodegen generate
make build

Grant Permissions

YapYap needs two permissions. On first launch it will prompt you for both.

🎤 Microphone

Required to capture your voice. macOS will show a standard permission dialog when you first try to record. Click Allow.

♿ Accessibility

Required to paste text into other apps. You need to toggle this manually in System Settings.

Open System Settings → Privacy & Security → Accessibility
Find YapYap in the list and toggle it ON

⚠️ If you build from source, you must re-grant this permission after every build. The code signature changes, and macOS silently revokes the grant.
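
If you re-grant this permission often, opening the Accessibility pane from a terminal saves a few clicks. This is a sketch that assumes the `x-apple.systempreferences` URL scheme, which is widely used but not something YapYap documents and which may change between macOS releases. `pane_url` is a hypothetical helper.

```shell
#!/bin/sh
# Builds the System Settings URL for a Privacy & Security pane.
# Assumption: the URL scheme below is accepted by this macOS version.
pane_url() {
  echo "x-apple.systempreferences:com.apple.preference.security?Privacy_$1"
}

# Only try to open the pane on macOS.
if [ "$(uname -s)" = "Darwin" ] && command -v open >/dev/null 2>&1; then
  open "$(pane_url Accessibility)"
fi
```

You still need to toggle YapYap ON by hand once the pane is open.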

Download Your First Model

YapYap ships without models, so you choose what to download. On first launch, the onboarding flow walks you through this. You can also download models at any time from Settings → Models.

Recommended starter setup

STT: Parakeet TDT v3 (~600MB)

The fastest option. It runs on the Neural Engine, so it doesn't count against your RAM budget. Good for English and 5 other languages.

LLM: Gemma 3 4B (~3.0GB)

Best cleanup quality in the medium tier. 140+ languages. Recommended for 16GB+ Macs.

On 8GB RAM: use Qwen 2.5 1.5B (~1GB) instead.
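
The starter recommendation can be written as a small decision rule. This is a sketch using the approximate sizes quoted above. The helper names are hypothetical, and treating every machine below 16GB like the 8GB case is an assumption.

```shell
#!/bin/sh
# Approximate download sizes in MB, taken from the figures above.
STT_MB=600      # Parakeet TDT v3
GEMMA_MB=3000   # Gemma 3 4B
QWEN_MB=1000    # Qwen 2.5 1.5B

# Prints the starter LLM for a machine with $1 GB of RAM.
# Assumption: anything below 16GB is treated like the 8GB case.
recommend_llm() {
  if [ "$1" -ge 16 ]; then
    echo "Gemma 3 4B"
  else
    echo "Qwen 2.5 1.5B"
  fi
}

# Prints the approximate total download (STT + LLM) in MB.
starter_download_mb() {
  if [ "$1" -ge 16 ]; then
    echo $(( STT_MB + GEMMA_MB ))
  else
    echo $(( STT_MB + QWEN_MB ))
  fi
}

echo "16GB: $(recommend_llm 16), ~$(starter_download_mb 16)MB"
echo "8GB: $(recommend_llm 8), ~$(starter_download_mb 8)MB"
```

The totals come to about 3.6GB for the 16GB setup and about 1.6GB for the 8GB setup.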

Models download from HuggingFace and are stored in ~/Library/Application Support/YapYap/models/. After downloading, no internet is needed.
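
To see what has been downloaded and how much space it takes, you can inspect that directory directly. This is a sketch: `list_models` is a hypothetical helper, and the layout inside the models directory is not specified here, so it only reports the size of each top-level entry.

```shell
#!/bin/sh
# Prints the on-disk size (in KiB) of each entry in a models directory,
# largest first. Defaults to the path given above.
list_models() {
  dir=${1:-"$HOME/Library/Application Support/YapYap/models"}
  if [ ! -d "$dir" ]; then
    echo "no models directory at $dir"
    return 1
  fi
  # -s summarizes each entry, -k reports sizes in KiB.
  du -sk "$dir"/* 2>/dev/null | sort -rn
}

# Report on the default location; a missing directory is not an error here.
list_models || true
```

Deleting an entry from this directory should free its disk space, but removing models through Settings → Models is the safer route.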

Make Your First Recording

⌥ Option + Space: Push-to-Talk (default)

Hold down, speak, release. Text appears where your cursor is.

⌥ Option + ⇧ Shift + Space: Hands-Free Mode

Toggle recording on/off. Auto-stops when it detects silence.

The floating bar (the little creature) appears while you're recording. Its animation reflects the current state: sleeping in the menu bar, attentive while listening, spinning while transcribing.

Explore the Modes

🎙️ Dictation

The default — speak naturally and get clean, formatted text pasted wherever your cursor is.

✏️ Command Mode

Highlight text in any app, press the hotkey, say a command ("make this more formal") — AI rewrites it in place.

📋 History

Browse and copy all past transcriptions from the popover or history window.

⚙️ Settings

Click the creature in the menu bar → Settings. Or use the keyboard shortcut shown in the popover.

🧠 Choose your models

Learn which speech and language model is best for your machine and use case.

Models guide →