things people ask 💬
Wait, it's actually free? What's the catch?
No catch. YapYap runs on open-source models (WhisperKit for STT, MLX for cleanup) that execute locally on your Mac's hardware. There are no servers to pay for, so there's nothing to charge you for. No trial, no word limits, no "upgrade for Pro."
Does my voice ever leave my computer?
Never. Unlike cloud dictation tools that send audio to external servers, YapYap processes everything on your Mac's Neural Engine. Your voice is transcribed locally, cleaned up locally, and discarded immediately. Zero network calls.
How fast is it compared to typing?
Most people speak at 130-170 WPM but type at only 40-80 WPM. With YapYap, you get clean, formatted text at speaking speed, roughly 3x faster than typing. The cleanup adds under a second of processing on recent Apple Silicon, and a few seconds at most on older or slower Macs.
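A quick back-of-the-envelope check of those numbers (illustrative arithmetic only, not app code): the exact speedup depends on how fast you talk and type, but the claim holds across the stated ranges.

```python
# Sanity-check the "roughly 3x" claim using the WPM ranges above.
speak_lo, speak_hi = 130, 170   # words per minute, speaking
type_lo, type_hi = 40, 80       # words per minute, typing

mid_speak = (speak_lo + speak_hi) / 2   # 150 WPM
mid_type = (type_lo + type_hi) / 2      # 60 WPM

print(mid_speak / mid_type)   # 2.5x at the midpoints
print(speak_hi / type_lo)     # 4.25x: fast talker, slow typist
print(speak_lo / type_hi)     # 1.625x: slow talker, fast typist
```

So a typical user lands somewhere between ~2.5x and ~3x, with the extremes ranging from about 1.6x to over 4x.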
What Macs does it support?
Any Mac with Apple Silicon — M1, M2, M3, M4 and all their Pro/Max/Ultra variants. The on-device models require the Neural Engine, so Intel Macs are not supported.
How does it know which app I'm using?
YapYap uses macOS accessibility APIs to detect the frontmost application — the same way any menu bar app works. It categorizes apps (IDE, email, chat, docs, terminal, social) and adjusts the cleanup prompt. No screenshots, no screen recording, no content reading.
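Conceptually, that detection step boils down to a lookup from the frontmost app's bundle identifier to a category. Here is a minimal sketch of the idea in Python; it is not YapYap's actual code, and the bundle IDs and fallback category are assumptions for illustration.

```python
# Illustrative sketch: map a macOS bundle identifier to a cleanup category.
# Real detection happens via macOS accessibility APIs in the app itself.
APP_CATEGORIES = {
    "com.microsoft.VSCode": "ide",
    "com.apple.mail": "email",
    "com.tinyspeck.slackmacgap": "chat",
    "com.apple.Terminal": "terminal",
}

def category_for(bundle_id: str) -> str:
    # Unknown apps fall back to a generic documents style.
    return APP_CATEGORIES.get(bundle_id, "docs")

print(category_for("com.apple.Terminal"))   # terminal
print(category_for("com.example.unknown"))  # docs
```

The category then selects which cleanup prompt the LLM sees, so dictation into a terminal reads differently from dictation into an email.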
What AI models does it use?
For text cleanup, YapYap defaults to Gemma 4B — a great balance of quality and speed on Apple Silicon. You can also switch to Gemma 1B for faster processing on lower-end machines, or choose from Qwen and Llama family models depending on your preference. All run locally via MLX.
What about the speech-to-text engine?
Whisper (via WhisperKit) is the default — battle-tested and excellent for English. We also support NVIDIA's Parakeet models for faster inference, and Voxtral support is coming soon. You can swap engines in settings without losing any of your preferences.
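The "swap engines without losing preferences" design can be sketched as a small common interface: every backend transcribes audio the same way, so settings only need to remember which engine name is active. This is a hypothetical Python illustration with placeholder engines, not the app's actual Swift implementation.

```python
# Hypothetical sketch of a pluggable speech-to-text engine interface.
from abc import ABC, abstractmethod

class TranscriptionEngine(ABC):
    @abstractmethod
    def transcribe(self, audio: bytes) -> str: ...

class WhisperEngine(TranscriptionEngine):
    def transcribe(self, audio: bytes) -> str:
        return "transcript from Whisper"   # placeholder, not a real model call

class ParakeetEngine(TranscriptionEngine):
    def transcribe(self, audio: bytes) -> str:
        return "transcript from Parakeet"  # placeholder

ENGINES = {"whisper": WhisperEngine, "parakeet": ParakeetEngine}

def engine_from_settings(name: str) -> TranscriptionEngine:
    # Only the engine name is stored, so every other preference survives a swap.
    return ENGINES[name]()

print(engine_from_settings("parakeet").transcribe(b""))  # transcript from Parakeet
```

Adding a new backend (say, Voxtral) then means adding one class and one dictionary entry, leaving the rest of the settings untouched.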
How accurate is it?
WhisperKit delivers 93-95% raw accuracy. The LLM cleanup step then fixes grammar, removes filler words, and formats the output — bringing the final result very close to cloud-powered tools. For most real-world dictation, the difference is imperceptible.
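To make the filler-word part of cleanup concrete, here is a toy regex version of that one sub-step. The real cleanup uses a local LLM; this sketch (with an assumed, incomplete filler list) only illustrates the idea.

```python
import re

# Toy sketch of one cleanup sub-step: stripping filler words.
# The filler list is illustrative, not what the app actually uses.
FILLERS = r"\b(um+|uh+|like|you know|i mean)\b[,.]?\s*"

def strip_fillers(text: str) -> str:
    cleaned = re.sub(FILLERS, "", text, flags=re.IGNORECASE)
    # Collapse any double spaces left behind by the removals.
    return re.sub(r"\s{2,}", " ", cleaned).strip()

print(strip_fillers("Um, so like the build is, you know, passing now"))
# -> so the build is, passing now
```

An LLM handles the cases a regex can't, like "I like this idea," where "like" carries meaning and must stay.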
Is it open source?
Fully. The source code is on GitHub under an open license. Inspect it, fork it, contribute to it. New Whisper models or LLM improvements can be integrated by anyone in the community. That's how YapYap keeps getting better — together.
Who built this?
YapYap was built by Sandeep — a developer who wanted to build a free, privacy-first alternative to existing voice dictation apps. Questions, feedback, or just want to say hi? Reach out at sandeeptnvs@gmail.com.