# AI Models
Superwhisper ships with a catalog of transcription and language models. Some run on your device and never touch the internet. Others run in the cloud when you want the fastest or most accurate output. You can switch at any time.
## How to read this page
Superwhisper runs two models back to back. A speech recognition model turns your voice into text. Then an optional language model rewrites that text in Super Mode so an email comes out like an email and a code prompt stays technical.
You can mix and match. On-device Parakeet feeding cloud Claude Sonnet is a common setup. So is Ultra with no language model at all, for raw transcription. The speed and accuracy dots are relative rankings inside Superwhisper, not industry-wide scores.
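The two-stage flow is just function composition. Here is a minimal sketch of the idea; the function names are hypothetical (Superwhisper does not expose a public API like this), but the shape matches the pipeline described above:

```python
# Illustrative sketch of the two-stage pipeline. Function names are
# hypothetical; Superwhisper does not expose a public API like this.

def dictate(audio, asr_model, language_model=None):
    """Stage 1 transcribes; stage 2 (Super Mode) optionally rewrites."""
    transcript = asr_model(audio)          # speech -> raw text
    if language_model is None:             # raw transcription, no rewrite
        return transcript
    return language_model(transcript)      # rewrite for the target app

# Mix and match: on-device ASR alone, or ASR feeding a language model.
asr = lambda audio: "send the report by friday"
polish = lambda text: text.capitalize() + "."

raw = dictate(b"...", asr)                           # raw transcript only
email = dictate(b"...", asr, language_model=polish)  # Super Mode rewrite
```

Swapping `asr` or `polish` for any model in the tables below is the whole configuration story: the two stages are independent.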
## Speech recognition

### Cloud

Audio is sent to the Superwhisper proxy, transcribed, and returned. Best for maximum accuracy and the lowest cloud latency.
| Model | Provider | Speed / Accuracy | Languages | Tier |
|---|---|---|---|---|
| Ultra | — | — | 100+ | Pro |
| S1-Voice | — | — | 100+ | Pro |
| Scribe V2 | ElevenLabs | — | 99 | Pro |
| Nova 3 | Deepgram | — | 36 | Pro |
| Nova 2 | Deepgram | — | 36 | Pro |
| Nova Medical | Deepgram | — | English | Pro |
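To make the proxy hop concrete, here is a sketch of what a cloud transcription request could look like. The endpoint URL and JSON field names are invented for illustration; the real point is visible in the code: the request carries no provider API key, because the proxy holds those.

```python
import base64
import json
import urllib.request

# Hypothetical proxy endpoint; the real URL and payload shape are not public.
PROXY_URL = "https://proxy.superwhisper.example/v1/transcribe"

def build_transcribe_request(audio: bytes, model: str) -> urllib.request.Request:
    """Package audio for the proxy. Note: no provider API key anywhere."""
    body = json.dumps({
        "model": model,                                   # e.g. "ultra"
        "audio": base64.b64encode(audio).decode("ascii"),
    }).encode("utf-8")
    return urllib.request.Request(
        PROXY_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_transcribe_request(b"\x00\x01\x02", model="ultra")
# urllib.request.urlopen(req) would send it; not executed here.
```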
### On-device

These models run locally on your Mac, PC, or iPhone. Audio never leaves the device and no internet connection is needed. The Fast, Nano, and Standard Whisper models are on the free tier; the faster Parakeets and larger Whisper variants ship with Pro.
| Model | Provider | Speed / Accuracy | Languages | Size | Tier |
|---|---|---|---|---|---|
| Parakeet V2 | NVIDIA | — | English | 476 MB | Pro |
| Parakeet V3 | NVIDIA | — | 24 | 494 MB | Pro |
| Ultra V3 | OpenAI (Whisper) | 2 / 5 | 100+ | 3.0 GB | Pro |
| Ultra V3 Turbo | OpenAI (Whisper) | 4 / 4 | 100+ | 1.6 GB | Pro |
| Pro | OpenAI (Whisper) | — | 100+ | 1.5 GB | Pro |
| Standard | OpenAI (Whisper) | — | 100+ | 500 MB | Free |
| Nano | OpenAI (Whisper) | — | 100+ | 150 MB | Free |
| Fast | OpenAI (Whisper) | — | 100+ | 75 MB | Free |
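The size and tier columns above are enough to script a simple model picker. This is a sketch, not a Superwhisper feature: it treats the table as data and picks the largest model that fits a download budget on your tier.

```python
# The on-device table as data: (name, size in MB, tier).
# Sizes and tiers are taken from the table above.
LOCAL_ASR = [
    ("Ultra V3", 3000, "pro"), ("Ultra V3 Turbo", 1600, "pro"),
    ("Pro", 1500, "pro"), ("Parakeet V3", 494, "pro"),
    ("Parakeet V2", 476, "pro"), ("Standard", 500, "free"),
    ("Nano", 150, "free"), ("Fast", 75, "free"),
]

def best_fit(budget_mb: int, tier: str = "free") -> str:
    """Largest download that fits the budget and is unlocked on this tier."""
    allowed = {"free"} if tier == "free" else {"free", "pro"}
    fits = [(size, name) for name, size, t in LOCAL_ASR
            if t in allowed and size <= budget_mb]
    return max(fits)[1] if fits else "none"
```

Note this ranks purely by download size as a proxy; in practice a smaller Parakeet can beat a larger Whisper variant on English, per the methodology below.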
## Super Mode
### Cloud

Super Mode uses these models to rewrite your transcript, translate it, or reformat it for the app you're in. Requests go through the Superwhisper proxy, so providers never see your account and nothing is retained for training.
| Model | Provider | Speed / Intelligence | Context | Tier |
|---|---|---|---|---|
| Claude Sonnet 4.6 | Anthropic | — | 1M | Pro |
| Claude Sonnet 4.5 | Anthropic | — | 200k | Pro |
| Claude Haiku 4.5 | Anthropic | — | 200k | Pro |
| GPT-5.4 mini | OpenAI | — | 400k | Pro |
| GPT-5.4 nano | OpenAI | — | 400k | Pro |
| GPT-5.3 Instant | OpenAI | — | 128k | Pro |
| GPT-5.2 | OpenAI | — | 400k | Pro |
| GPT-5.1 | OpenAI | — | 400k | Pro |
| GPT-5 mini | OpenAI | — | 400k | Pro |
| GPT-5 nano | OpenAI | — | 400k | Pro |
| Gemini 3 Flash | Google | — | 1M | Pro |
| Gemini 3.1 Flash Lite | Google | — | 1M | Pro |
| Grok 4.1 Fast | xAI | — | 2M | Pro |
| S1-Language | — | — | 128k | Pro |
| Llama 3.1 8B | Meta (via Groq) | — | 128k | Pro |
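The context column matters when you dictate something very long. As a sketch of the guardrail involved, the check below estimates tokens with a rough four-characters-per-token heuristic (not a real tokenizer) against the context windows from the table:

```python
# Context windows (in tokens) from the table above.
CONTEXT = {
    "Claude Sonnet 4.6": 1_000_000,
    "GPT-5.2": 400_000,
    "GPT-5.3 Instant": 128_000,
    "Grok 4.1 Fast": 2_000_000,
}

def fits(model: str, transcript: str, reserve: int = 2_000) -> bool:
    """Rough check that transcript + prompt overhead fits the context window.

    Uses a ~4 chars/token heuristic, not a real tokenizer; `reserve`
    leaves headroom for the rewrite instructions and the output."""
    rough_tokens = len(transcript) // 4 + reserve
    return rough_tokens <= CONTEXT[model]
```

For everyday dictation any of these windows is far more than enough; the check only matters for pasting in long documents to reformat.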
### On-device

These models run locally through llama.cpp on Apple Silicon or Windows. No internet connection is needed. Size is the download size on disk. All are included with Pro.
| Model | Provider | Speed / Intelligence | Size | Tier |
|---|---|---|---|---|
| GPT OSS 20B | OpenAI | — | 14 GB | Pro |
| DeepSeek R1 Distill | DeepSeek | — | 5.4 GB | Pro |
| Ministral 3 8B | Mistral AI | — | 5.2 GB | Pro |
| Llama 3 8B | Meta | — | 4.9 GB | Pro |
| Mistral 7B v0.2 | Mistral AI | — | 4.4 GB | Pro |
| Llama 3.2 3B | Meta | — | 1.9 GB | Pro |
| Phi-2 3B | Microsoft | — | 1.8 GB | Pro |
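For a feel of what "through llama.cpp" means, here is a sketch of invoking llama.cpp's `llama-cli` binary on one of these GGUF downloads. The `-m`, `-p`, and `-n` flags are real llama.cpp flags; the model path and prompt wrapper are illustrative, and Superwhisper's internal integration is not public.

```python
# Sketch of driving a local Super Mode model via llama.cpp's llama-cli.
# The prompt wrapper is invented; -m / -p / -n are upstream llama.cpp flags.
def llama_cpp_argv(model_path: str, transcript: str, max_tokens: int = 256) -> list[str]:
    """Build the command line for a local rewrite of a dictated transcript."""
    prompt = f"Rewrite this dictation as a polished message:\n{transcript}"
    return ["llama-cli", "-m", model_path, "-p", prompt, "-n", str(max_tokens)]

argv = llama_cpp_argv("Llama-3.2-3B.Q4_K_M.gguf", "send report friday")
# subprocess.run(argv, capture_output=True, text=True)  # needs llama.cpp installed
```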
## Methodology
Transcription speed and accuracy are anchored to the Hugging Face Open ASR Leaderboard. Parakeet V2 leads on real-time factor with an industry-best 6.05% word error rate for English. Whisper Large V3 scores slightly higher on accuracy but runs much slower, which is why Ultra V3 earns a 2 on speed and a 5 on accuracy while the Turbo variant evens out at 4 and 4.
Language model speed and intelligence are anchored to Artificial Analysis. Claude Sonnet 4.6 and the GPT-5.4 series sit at the top of the intelligence index alongside GPT-5.1 and GPT-5.2. Gemini 3.1 Flash Lite pushes above 200 tokens per second, which is why it gets the speed crown on the lite side. Groq's Llama 3.1 8B is faster still, but its smaller parameter count caps it at a 2 on intelligence.
The dots are relative to the other models in Superwhisper, not the entire industry. A 5 is the best in its class. A 1 is there for people who want a tiny download and can live with a rougher transcript.
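The relative-ranking idea can be sketched as a min-max squash into the 1-to-5 scale. The numbers below are illustrative word error rates, not Superwhisper's actual inputs, and as noted above the real rankings also include hand adjustment on top of the benchmarks:

```python
def to_dots(value: float, cohort: list[float], higher_is_better: bool = True) -> int:
    """Map a raw benchmark number to a 1..5 rank relative to its cohort."""
    lo, hi = min(cohort), max(cohort)
    if hi == lo:
        return 5                        # everyone ties
    frac = (value - lo) / (hi - lo)
    if not higher_is_better:            # e.g. word error rate: lower wins
        frac = 1.0 - frac
    return 1 + round(frac * 4)          # map [0, 1] onto 1..5

wers = [6.05, 7.4, 8.9, 10.2, 12.5]     # illustrative WERs, lower is better
best = to_dots(6.05, wers, higher_is_better=False)    # best in cohort -> 5
worst = to_dots(12.5, wers, higher_is_better=False)   # worst in cohort -> 1
```

This is why a 5-dot local model and a 5-dot cloud model are not equivalent: each is ranked only against the other models in its own table.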
## Privacy
Every on-device model in this list runs entirely on your hardware. Your microphone input never leaves the machine, we do not log audio, and nothing you dictate is stored on our servers. You can work on a plane or inside a secure environment and the models behave the same way.
Cloud models are optional. When you pick one, audio and text go through the Superwhisper proxy to the provider and back. Providers see a proxy request, not your account. Nothing is retained for training. Enterprise customers can swap in their own API keys or host compatible models behind a VPC.
## What people say

**Andrej Karpathy**

> There's a new kind of coding I call "vibe coding", where you fully give in to the vibes, embrace exponentials, and forget that the code even exists. I just talk to Composer with @superwhisperapp...

**Pieter Levels**

> Tried @superwhisperapp today. Very nice. Lets me talk to Cursor and then it codes for me, just gets it right.
## Support
**Which model should I use?**

For everyday voice-to-text on a Mac with Apple Silicon, Parakeet is a good starting point. It runs on your device, handles English well, and feels instant. If you want the best accuracy in any language, Ultra is the cloud transcription model we run. For Super Mode, Claude Sonnet 4.6 or GPT-5.2 will give you the best rewrites, GPT-5.4 mini is a solid default, and Haiku 4.5 or GPT-5.4 nano are the right call when you want faster output.
**What's the difference between on-device and cloud models?**

On-device models run locally. Your audio never leaves the computer, nothing is logged, and you can work offline. Cloud models are faster for their accuracy class because they run on larger servers, and they cover more languages. Superwhisper routes everything through its own proxy, so providers like OpenAI and Anthropic never see your account or your content.
**How are the speed and accuracy dots scored?**

We start with published benchmarks from the Hugging Face Open ASR Leaderboard for transcription and Artificial Analysis for LLMs, then adjust based on how the models actually feel in Superwhisper. The dots compare models inside Superwhisper, not across the whole industry. A 5 is the best in its class, not a claim that every 5-dot model is equivalent.
**What does the free tier include?**

The free tier includes the Fast, Nano, and Standard Whisper models. They run entirely on-device, work offline, and are enough for most everyday voice-to-text. Pro unlocks Parakeet, the larger Whisper variants, Ultra in the cloud, and every Super Mode language model.
**Can I bring my own models or API keys?**

Yes. Enterprise customers can run Superwhisper against self-hosted models or against private endpoints from OpenAI, Anthropic, Google, or AWS Bedrock. Model access is controlled centrally through the admin dashboard.
**Is Superwhisper compliant for regulated industries?**

Superwhisper is SOC 2 Type II certified and HIPAA compliant. On-device models never send audio anywhere. Cloud requests are proxied through our infrastructure, and providers don't retain your data for training.
**How quickly do new models get added?**

We add new models as they ship. Claude Sonnet 4.6, the GPT-5.4 series, GPT-5.3 Instant, Gemini 3.1 Flash Lite, and Parakeet V3 landed within days of release. If a model you want isn't here yet, it's probably on the way.
**How should I handle sensitive data?**

Pick an on-device model and your audio stays on your machine: nothing is sent to us or any provider, and you can work fully offline. For teams handling PHI or regulated data, Superwhisper is HIPAA compliant and SOC 2 Type II certified, and Enterprise can route cloud requests through your own private endpoints. Read our guide on handling sensitive data.
**Free or Pro, in one line?**

Free tier runs the core Whisper models on-device. Pro unlocks Parakeet, Ultra, and every cloud model.