Windows · open source · 100% indoor

Speak. It types.
It never goes outside.

Typurr turns speech into clean, finished text at your cursor, in any Windows app — with speech and language models that live on your own hardware, like a well-fed house cat.

Typurr does more than fix typos. Say “thursday… actually friday” and it keeps only what you meant.

Free and MIT licensed. Best on an NVIDIA GPU — real options for every other machine.

How it works

Four steps, one machine, about a second.

This is the whole pipeline. There is no fifth step where your voice visits a data center.

1. You hold a key and talk

Push-to-talk, hands-free toggle, or just say “hey typurr” from across the room.

2. Local ears

Whisper large-v3-turbo on your GPU, or Parakeet on CPU with a live transcript while you speak.

3. Local brain

A 7B language model (llama.cpp, in-process) strips the “um”s, fixes punctuation, and resolves your self-corrections.

4. Text lands at your cursor

Injected into whatever has focus — editors, browsers, chat, even terminals, safely.

Warm end-to-end latency is ≈1 second on a consumer NVIDIA GPU. Models download once (a few GB) and live in a folder you can see.
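For the curious, the four steps above can be pictured as two swappable stages chained together. This is only a sketch under assumptions: the trait and type names are hypothetical, the transcriber returns a canned string instead of running Whisper or Parakeet, and the cleaner is a rule-based stand-in for the language model. It is not Typurr's actual code.

```rust
/// Speech-to-text stage ("local ears"). Hypothetical trait, not Typurr's API.
pub trait Transcriber {
    fn transcribe(&self, audio: &[f32]) -> String;
}

/// Text cleanup stage ("local brain"). It only ever receives text, never audio.
pub trait Cleaner {
    fn clean(&self, transcript: &str) -> String;
}

/// Stand-in transcriber that ignores the audio and returns a canned transcript.
pub struct CannedTranscriber(pub &'static str);

impl Transcriber for CannedTranscriber {
    fn transcribe(&self, _audio: &[f32]) -> String {
        self.0.to_string()
    }
}

/// Rule-based stand-in for the language model: drops the fillers "um" and "uh".
pub struct FillerStripper;

impl Cleaner for FillerStripper {
    fn clean(&self, transcript: &str) -> String {
        transcript
            .split_whitespace()
            .filter(|w| {
                // Compare without surrounding punctuation, case-insensitively.
                let bare = w
                    .trim_matches(|c: char| !c.is_alphanumeric())
                    .to_lowercase();
                !matches!(bare.as_str(), "um" | "uh")
            })
            .collect::<Vec<_>>()
            .join(" ")
    }
}

/// Steps 2 and 3 chained: audio in, cleaned text out.
pub fn dictate(ears: &dyn Transcriber, brain: &dyn Cleaner, audio: &[f32]) -> String {
    brain.clean(&ears.transcribe(audio))
}
```

Because each stage sits behind its own trait, either one can be swapped without touching the other, and the cleanup stage has no way to see audio at all.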

The indoor-cat guarantee

What leaves your machine, total: 0 bytes of your voice, text, or history. Ever.

There is no server: no account, no telemetry, no audio “retained to improve the service.” Your dictations, your corrections, and everything Typurr learns about how you speak live in plain files on your disk that you can open, read, and delete. Nothing to breach, nothing to subpoena, nothing phoning home in the night. (One honest asterisk: you can plug in your own AI provider key if your hardware needs it — that choice, and exactly what it changes, is spelled out below.)

Runs on your terms

No NVIDIA GPU? You still have good options.

Typurr is engine-swappable. Local is the default and the point — but not everyone has 8 GB of VRAM lying around, so here's the honest menu.

NVIDIA GPU

The full experience, fully local: Whisper large-v3-turbo for ears, a 7B model for the brain, ≈1s from voice to text. Nothing leaves your machine. This is the configuration the project's accuracy benchmark guards.

Any modern CPU

Dictation stays fully local: Parakeet runs speech recognition fast on CPU, with a live transcript as you speak. Text cleanup runs on a small local model, or through Ollama if you have it.

Bring your own API key

Point cleanup at any OpenAI-compatible provider — DeepSeek, Groq, OpenAI, OpenRouter… Your audio still never leaves: speech-to-text stays on your machine; only the transcript text goes to the provider you chose, under your own key.

The key never sits in plaintext: it's sealed with Windows DPAPI, bound to your user account on your machine — a copied config file leaks nothing. Every model the app downloads shows its license as it lands, and the Settings model picker lets you browse Hugging Face with license badges before choosing.

Features

A dictation app that grew into an assistant.

Everything below is spoken, local, and optional — each one is a toggle in Settings.

Cleans up how you actually talk

Fillers dropped, punctuation added, and mid-sentence changes of mind resolved: say “…by thursday, actually friday” and only Friday survives. Four polish levels, from verbatim to full rewrite.

“…by thursday — actually friday”

See it happen, live

Your words appear in the overlay as you speak, and the cleaned text streams in as the model writes it. No black box between your voice and your cursor.

Reshape anything

Modes turn dictation into what you need: professional, casual, bullets, a spec builder that turns a brain-dump into requirements and acceptance criteria, a prompt-optimizer for AI tools, or your own. Select any text anywhere and one hotkey rewrites it in the current mode.

“using professional, …” “using spec, …”

Drafts you approve by voice

High-stakes modes hold their output on the card instead of typing it blind. Say “send it” and it lands where you were working; “scrap that” and it's gone — or just say what to change and the draft revises in place.

“make it tighter” “send it”
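One way to picture the approval step: every phrase spoken while a draft is on the card is either “send it”, “scrap that”, or a revision request. A minimal sketch, assuming exact phrase matching after trimming and lowercasing; the enum and function names are hypothetical, and the real app may recognize these phrases quite differently.

```rust
/// What to do with a held draft. Hypothetical type, for illustration only.
#[derive(Debug, PartialEq)]
pub enum DraftCommand {
    /// Type the draft where the user was working.
    Send,
    /// Discard the draft.
    Scrap,
    /// Anything else is treated as an instruction to revise the draft.
    Revise(String),
}

/// Classify a spoken phrase heard while a draft is waiting for approval.
pub fn parse_draft_command(spoken: &str) -> DraftCommand {
    let trimmed = spoken.trim();
    // Ignore trailing sentence punctuation and letter case when matching.
    let normalized = trimmed
        .trim_end_matches(|c: char| c == '.' || c == '!')
        .to_lowercase();
    match normalized.as_str() {
        "send it" => DraftCommand::Send,
        "scrap that" => DraftCommand::Scrap,
        _ => DraftCommand::Revise(trimmed.to_string()),
    }
}
```

Treating every unrecognized phrase as a revision keeps the flow conversational: there is no separate "edit mode" to enter.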

Prompts grounded in your machine

The prompt-optimizer reads the window you're targeting, your clipboard, and the text already in the field — so “fix that error I copied” becomes a fully-specified instruction, not a vague wish.

“fix that error I copied”

Run your computer by voice

Switch windows, scroll, click buttons by their name, press key chords, type text verbatim, launch apps — explicit commands only, never guessed.

“switch to chrome” “click send” “press control s”

Ask it things — it answers out loud

It remembers facts you tell it and searches your own dictation history. Answers land in a scratchpad and are read aloud with a local voice.

“ask typurr, what did I tell Dana?” “stop talking”

Sees your screen, locally

Ask about whatever's on the monitor you're using — a local vision model answers. Catch up on a long AI reply without reading it. The screenshot never leaves RAM on its way to the model.

“what’s on my screen?” “what did Claude say?”

Gives your AI agents a voice

An opt-in local MCP server (loopback only) lets agents like Claude Code speak to you, ask you a question by voice and get your cleaned-up spoken answer back, shape text through your modes, and search your dictation history.

the agent asks — you just answer
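“Loopback only” means the server listens on 127.0.0.1, so only programs on your own machine can connect. A minimal sketch of that kind of bind using Rust's standard library; the function name is hypothetical, and this shows only the socket binding, not Typurr's server or the MCP protocol itself.

```rust
use std::net::{Ipv4Addr, SocketAddr, TcpListener};

/// Bind a TCP listener to the IPv4 loopback address only.
/// Passing port 0 lets the operating system choose a free port.
pub fn bind_loopback(port: u16) -> std::io::Result<TcpListener> {
    // 127.0.0.1 is not routable, so other machines cannot reach this socket.
    TcpListener::bind(SocketAddr::from((Ipv4Addr::LOCALHOST, port)))
}
```

Binding to 0.0.0.0 instead would expose the port on every network interface, which is exactly what a loopback-only server avoids.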

Keeps track of what you said you'd do

An optional little Tracker window lists the commitments it hears in your dictations — and crosses them off when your own words say they're done. Add, tick, or toss anything by hand.

“typurr tracker” “typurr todo ship the report”

Learns you

Your vocabulary biases speech recognition. Your corrections become rules. Your edits teach it style. All stored in plain JSON you can read, edit, or wipe.

“correct that layla to Rayla”
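To make “corrections become rules” concrete, here is a small sketch of a whole-word replacement table. It is an illustration under assumptions: the type and method names are hypothetical, the rules live in memory rather than in Typurr's JSON files, and the real matching logic may be more sophisticated.

```rust
use std::collections::HashMap;

/// Hypothetical rule store: lowercase misheard word -> preferred spelling.
pub struct CorrectionRules {
    rules: HashMap<String, String>,
}

impl CorrectionRules {
    pub fn new() -> Self {
        Self { rules: HashMap::new() }
    }

    /// Record a correction such as "layla" -> "Rayla".
    pub fn learn(&mut self, heard: &str, meant: &str) {
        self.rules.insert(heard.to_lowercase(), meant.to_string());
    }

    /// Replace whole words that match a rule, ignoring case.
    /// Punctuation and spacing are copied through unchanged.
    pub fn apply(&self, text: &str) -> String {
        let mut out = String::with_capacity(text.len());
        let mut word = String::new();
        for ch in text.chars() {
            if ch.is_alphanumeric() {
                word.push(ch);
            } else {
                self.flush(&mut word, &mut out);
                out.push(ch);
            }
        }
        self.flush(&mut word, &mut out);
        out
    }

    /// Emit the pending word, corrected if a rule matches, then reset it.
    fn flush(&self, word: &mut String, out: &mut String) {
        if word.is_empty() {
            return;
        }
        match self.rules.get(&word.to_lowercase()) {
            Some(fixed) => out.push_str(fixed),
            None => out.push_str(word.as_str()),
        }
        word.clear();
    }
}
```

Matching whole words only means a rule for “layla” leaves a longer word such as “playlayla” alone.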

Writes its own shortcuts

Every so often it studies your recent usage and authors snippets, custom modes, and per-app rules for you. Everything it does is journaled and one click to revert — and it never re-learns what you've rejected.

Wake word

Opt-in idle listening that starts hands-free dictation when you call it. It answers to “typer” too — it's a cat, not a speller.

“hey typurr”

The small conveniences that add up

Voice todos that check themselves off. Snippets by trigger phrase. Template variables like clipboard and today's date. End with “send it” and Enter is pressed for you. Spoken code generation, per-app profiles with their own project lexicons, “read it back” when you want to hear what landed, and a scratchpad for drafting.

“typurr todo ship the report” “in python, …” “send it”
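Template variables are easiest to see in a tiny example. The sketch below assumes a `{name}` placeholder syntax and a caller that supplies the values (the clipboard contents, today's date); both the syntax and the function name are assumptions for illustration, not Typurr's documented format.

```rust
use std::collections::HashMap;

/// Replace `{name}` placeholders with values from `vars`.
/// Unknown placeholders and unmatched braces are left untouched.
pub fn expand_template(template: &str, vars: &HashMap<&str, String>) -> String {
    let mut out = String::with_capacity(template.len());
    let mut rest = template;
    while let Some(open) = rest.find('{') {
        // Copy everything before the opening brace as-is.
        out.push_str(&rest[..open]);
        let after = &rest[open + 1..];
        match after.find('}') {
            Some(close) => {
                let name = &after[..close];
                match vars.get(name) {
                    Some(value) => out.push_str(value),
                    None => {
                        // Not a known variable: keep the placeholder verbatim.
                        out.push('{');
                        out.push_str(name);
                        out.push('}');
                    }
                }
                rest = &after[close + 1..];
            }
            None => {
                // No closing brace: copy the remainder and stop scanning.
                out.push_str(&rest[open..]);
                rest = "";
            }
        }
    }
    out.push_str(rest);
    out
}
```

Leaving unknown placeholders in place makes a typo in a snippet visible in the output instead of silently deleting text.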

Get started

Installed to first dictation in three steps.

1. Download & run

Grab the portable zip, unzip, run typurr.exe. Want a Start Menu entry and start-at-login? .\install.ps1 -Autostart does both; -Uninstall removes it and keeps your data.

2. Let the models land

First run downloads the speech + language models (a few GB) to %APPDATA%\Typurr\models — the overlay names each one and its license while it happens. One time only.

3. Hold Ctrl+Win and talk

Release, and clean text lands at your cursor. Double-tap for hands-free. Say “what’s on my screen?” or “switch to chrome” when you’re ready for the rest.

Windows 10/11. Best with an NVIDIA GPU (≥8 GB VRAM, ≈1s latency) — solid options for every other machine. Prefer compiling it yourself? cargo build --release --features cuda — build docs in the repo. Verify anything with typurr --doctor.

Open source

Read every line it runs.

Privacy claims are cheap; source code isn’t. Typurr is a single Rust binary you can build yourself — and the settings screen has nothing to hide because the config is a JSON file next to your models.

Star it on GitHub