Open source
Read every line it runs.
Privacy claims are cheap; source code isn’t. Typurr is a single Rust binary you can build yourself — and the settings screen has nothing to hide because the config is a JSON file next to your models.
Windows · open source · 100% indoor
Typurr turns speech into clean, finished text at your cursor, in any Windows app — with speech and language models that live on your own hardware, like a well-fed house cat.
That correction wasn’t a typo fix. You said “thursday… actually friday” and it kept only what you meant.
Free and MIT licensed. Best on an NVIDIA GPU — real options for every other machine.
How it works
This is the whole pipeline. There is no fifth step where your voice visits a data center.
Push-to-talk, hands-free toggle, or just say “hey typurr” from across the room.
Whisper large-v3-turbo on your GPU, or Parakeet on CPU with a live transcript while you speak.
A 7B language model (llama.cpp, in-process) strips the “um”s, fixes punctuation, and resolves your self-corrections.
Injected into whatever has focus — editors, browsers, chat, even terminals, safely.
Warm end-to-end latency is ≈1 second on a consumer NVIDIA GPU. Models download once (a few GB) and live in a folder you can see.
The indoor-cat guarantee
bytes of your voice, text, or history. Ever.
There is no server: no account, no telemetry, no audio “retained to improve the service.” Your dictations, your corrections, and everything Typurr learns about how you speak live in plain files on your disk that you can open, read, and delete. Nothing to breach, nothing to subpoena, nothing phoning home in the night. (One honest asterisk: you can plug in your own AI provider key if your hardware needs it — that choice, and exactly what it changes, is spelled out below.)
Runs on your terms
Typurr is engine-swappable. Local is the default and the point — but not everyone has 8 GB of VRAM lying around, so here's the honest menu.
The full experience, fully local: Whisper large-v3-turbo for ears, a 7B model for the brain, ≈1s from voice to text. Nothing leaves. This is the build the accuracy bench guards.
Dictation stays fully local — Parakeet runs speech recognition fast on CPU, with the live transcript. Text cleanup runs on a small local model or through Ollama if you have it.
Point cleanup at any OpenAI-compatible provider — DeepSeek, Groq, OpenAI, OpenRouter… Your audio still never leaves: speech-to-text stays on your machine; only the transcript text goes to the provider you chose, under your own key.
The key never sits in plaintext: it's sealed with Windows DPAPI, bound to your user account on your machine — a copied config file leaks nothing. Every model the app downloads shows its license as it lands, and the Settings model picker lets you browse Hugging Face with license badges before choosing.
Features
Everything below is spoken, local, and optional — each one is a toggle in Settings.
Fillers dropped, punctuation added, and mid-sentence changes of mind resolved — say "…by thursday, actually friday" and only Friday survives. Four polish levels, from verbatim to full rewrite.
“…by thursday — actually friday”A streaming recognizer paints your words on the card as you say them — any engine, no length limit — and the cleaned text streams in as the model writes it. It even reads your pauses: "…at two… actually three" resolves the way you meant it, because the model hears how you spoke.
Read-backs and answers come from a neural voice running on your own CPU — not the robot from 1998. Start talking and it stops mid-sentence; you always have the floor.
“read it back” “stop talking”Every cleanup is checked against what you actually said — an invented name or a number from nowhere triggers an automatic correction pass. And it remembers where you undo or fix things, spending extra care exactly there.
Modes turn dictation into what you need: professional, casual, bullets, a spec builder that turns a brain-dump into requirements and acceptance criteria, a prompt-optimizer for AI tools, or your own. Select any text anywhere and one hotkey rewrites it in the current mode.
“using professional, …” “using spec, …”High-stakes modes hold their output on the card instead of typing it blind. Say “send it” and it lands where you were working; “scrap that” and it's gone — or just say what to change and the draft revises in place.
“make it tighter” “send it”The prompt-optimizer reads the window you're targeting, your clipboard, and the text already in the field — so “fix that error I copied” becomes a fully-specified instruction, not a vague wish.
“fix that error I copied”Switch windows, scroll, click buttons by their name, press key chords, type text verbatim, launch apps — explicit commands only, never guessed.
“switch to chrome” “click send” “press control s”It remembers facts you tell it and searches your own dictation history. Answers land in a scratchpad and are read aloud with a local voice.
“ask typurr, what did I tell Dana?” “stop talking”Ask about whatever's on the monitor you're using — a local vision model answers. Catch up on a long AI reply without reading it. The screenshot never leaves RAM on its way to the model.
“what’s on my screen?” “what did Claude say?”An opt-in local MCP server (loopback only) lets agents like Claude Code speak to you, ask you a question by voice and get your cleaned-up spoken answer back, shape text through your modes, and search your dictation history.
the agent asks — you just answerSay "typurr do…" and it plans across its own tools — recall past dictations, reshape them, insert, track, summarize the screen — and runs the whole chain. Requests outside its tools get an honest "can't do that", never an improvisation.
“typurr do find what I said about pricing and read it to me”An optional little Tracker window lists the commitments it hears in your dictations — and crosses them off when your own words say they're done. Add, tick, or toss anything by hand.
“typurr tracker” “typurr todo ship the report”Your vocabulary biases speech recognition. Your corrections become rules. Your edits teach it style — and you can export your own speech-to-text pairs and train a personal model on how you talk. All stored in plain files you can read, edit, or wipe.
“correct that layla to Rayla”Every so often it studies your recent usage and authors snippets, custom modes, and per-app rules for you. Everything it does is journaled and one click to revert — and it never re-learns what you've rejected.
Opt-in idle listening that starts hands-free dictation when you call it. It answers to “typer” too — it's a cat, not a speller.
“hey typurr”Voice todos that check themselves off. Snippets by trigger phrase. Template variables like clipboard and today's date. End with “send it” and Enter is pressed for you. Spoken code generation, per-app profiles with their own project lexicons, “read it back” when you want to hear what landed, a scratchpad for drafting.
“typurr todo ship the report” “in python, …” “send it”Get started
Grab the portable zip, unzip, run typurr.exe. Want a Start Menu entry and start-at-login? .\install.ps1 -Autostart does both; -Uninstall removes it and keeps your data.
First run downloads the speech + language models (a few GB) to %APPDATA%\Typurr\models — the overlay names each one and its license while it happens. One time only.
Release, and clean text lands at your cursor. Double-tap for hands-free. Say “what’s on my screen?” or “switch to chrome” when you’re ready for the rest.
Windows 10/11. Best with an NVIDIA GPU (≥8 GB VRAM, ≈1s latency) — solid options for every other machine. Prefer compiling it yourself? cargo build --release --features cuda — build docs in the repo. Verify anything with typurr --doctor.
Open source
Privacy claims are cheap; source code isn’t. Typurr is a single Rust binary you can build yourself — and the settings screen has nothing to hide because the config is a JSON file next to your models.