Simple: Hold ⌘, speak, text appears in any app
Routing: Hold ⌘, tap 1 → Transform → Send to AI agent
Free 5 min/day · No account required
Voice-to-agent in milliseconds. Everything runs on your Mac.
Your voice never leaves your Mac. Speech recognition runs on the Apple Neural Engine via CoreML. AI transforms also run locally. No internet required.
Multiple speech models to choose from. Fast English recognition or 100+ languages. Auto-detect or set your language explicitly.
Press the key — recording starts in 20 milliseconds. Other tools take 200ms+ and clip your first word. When you're rapid-firing tasks to 3 agents, every lost word means repeating yourself.
Plain TOML config + shell scripts. Your AI agent can add channels, write transforms, and configure routing — programmatically. The tool evolves with your workflow.
⌘+1 sends to Claude Code. ⌘+2 sends to Slack. ⌘+3 creates a Linear issue. Each hotkey is a programmable channel with its own transform pipeline and destination. No other tool does this.
Your rambling speech becomes a structured task. Transform scripts turn "fix the auth bug where tokens expire" into a formatted Problem/Steps/Priority prompt that your agent acts on immediately.
Industry First
Voice recognition and AI text transform — both running on your Mac. Zero data leaves your device.
Trusted by developers and creators
"With other apps I kept losing the first word and had to repeat myself. SpeechButton captures everything from the first syllable. Saved me 2 hours a day on documentation alone."
"I walk around the room and code by voice. My back pain is gone. Wrote 40% more code last month just by dictating instead of typing. Development feels completely different now."
"I manage 5 AI agents just by talking. Hold a key, give a task, release. Cut context-switching time by 70%. The channel routing is a game changer for multi-agent workflows."
Start free. Upgrade when you need more.
Voice dictation is 3× faster than typing. See what that means for you.
*Based on a $50/hour average developer rate. Voice is ~3× faster than typing.
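The arithmetic behind that footnote can be sketched in a few lines. This is only an illustration of the stated assumptions ($50/hour, voice about 3× faster than typing); the function name `daily_savings` and the formula itself are hypothetical, not the calculator SpeechButton ships.

```python
def daily_savings(typing_hours_per_day, hourly_rate=50.0, speedup=3.0):
    """Estimate time and money saved by dictating instead of typing.

    Assumes the same text takes typing_hours / speedup hours by voice.
    """
    voice_hours = typing_hours_per_day / speedup
    hours_saved = typing_hours_per_day - voice_hours
    return hours_saved, hours_saved * hourly_rate


hours, dollars = daily_savings(3.0)
print(f"{hours:.1f} hours and ${dollars:.0f} saved per day")
```

Under these assumptions, three hours of daily typing shrinks to one hour of dictation.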
Both plans: 100% local. Your voice never leaves your Mac. No cloud. No data collection.
SpeechButton transcribes in chunks as you speak. Every time you pause briefly, the chunk is transcribed and pasted instantly. When you finish talking, almost all text is already there.
With auto-Enter enabled, a longer silence (3 s) sends the full message automatically. Hold the hotkey, talk to Claude Code or ChatGPT, and pause briefly between thoughts: chunks appear in real time. Stop talking for 3 seconds and the message is sent. No keyboard needed.
# Hands-free VAD
[vad]
enabled = true
chunk_silence_sec = 0.7
[global]
auto_send = true
send_delay_sec = 3.0
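To see how the two silence thresholds could interact, here is a minimal simulation. It is a sketch, not SpeechButton's actual VAD: the function `vad_events`, the 100 ms frame size, and the boolean speech frames are all assumptions made for illustration. Only the two threshold values come from the config above.

```python
def vad_events(frames, frame_ms=100, chunk_silence_sec=0.7, send_delay_sec=3.0):
    """Return (time_ms, kind) events for a stream of speech/silence frames.

    frames: iterable of booleans, True where speech was detected.
    A "chunk" fires after a short pause, a "send" after a long one.
    """
    chunk_ms = round(chunk_silence_sec * 1000)
    send_ms = round(send_delay_sec * 1000)
    events = []
    silence_ms = 0
    pending_speech = False  # speech heard since the last chunk
    unsent = False          # chunks pasted since the last send
    for i, is_speech in enumerate(frames):
        now_ms = (i + 1) * frame_ms
        if is_speech:
            silence_ms = 0
            pending_speech = True
            continue
        silence_ms += frame_ms
        if pending_speech and silence_ms >= chunk_ms:
            events.append((now_ms, "chunk"))
            pending_speech = False
            unsent = True
        if unsent and silence_ms >= send_ms:
            events.append((now_ms, "send"))
            unsent = False
    return events


# 1 s of speech, 1 s pause, 0.5 s of speech, then 3.5 s of silence
timeline = [True] * 10 + [False] * 10 + [True] * 5 + [False] * 35
for time_ms, kind in vad_events(timeline):
    print(time_ms, kind)
```

Short pauses produce chunks while you keep talking; only the final long silence triggers the send.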
Your speech goes through a pipeline: voice → raw text → prompt-driven transform → destination. You control the transform with a prompt.
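The pipeline shape can be sketched as a transform followed by a fan-out to destinations. Everything here is illustrative: `cleanup` is a deterministic stand-in for a prompt-driven transform, and `run_pipeline` is a hypothetical name, not part of SpeechButton.

```python
FILLERS = {"um", "uh", "er"}


def cleanup(raw_text):
    """Stand-in for a prompt-driven transform: drop filler words."""
    kept = [w for w in raw_text.split() if w.lower().strip(",.") not in FILLERS]
    return " ".join(kept)


def run_pipeline(raw_text, transform, destinations):
    """raw text -> transform -> every destination receives the same result."""
    text = transform(raw_text)
    for send in destinations:
        send(text)
    return text


log = []
run_pipeline("um fix the auth bug where, uh, tokens expire", cleanup, [log.append, print])
```

Each destination is just a callable, so paste, file, and webhook outputs could sit side by side in the same list.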
Text doesn't just go into your active app. Route your speech to AI agents, work tools, social media, and more — each with its own hotkey channel.
Each destination is a [[hotkey]] channel in your config. Press ⌘+1, ⌘+2, ⌘+3 — each routes to a different place.
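Channel lookup can be pictured as matching a key plus an optional channel digit against the list of hotkey tables. The sketch below uses plain dicts that mirror `[[hotkey]]` entries; the `resolve_channel` helper is hypothetical and only shows one way such routing might work.

```python
HOTKEYS = [
    {"key": "RightCommand", "name": "default", "paste": "accessibility"},
    {"key": "RightCommand", "channel": "1", "name": "cleanup",
     "transform": "prompts/cleanup.md"},
    {"key": "RightCommand", "channel": "2", "name": "translate",
     "transform": "prompts/translate_en.md"},
]


def resolve_channel(hotkeys, key, channel=None):
    """Return the hotkey entry for key + channel, or None if nothing matches.

    channel=None selects the entry that has no channel (the default).
    """
    for entry in hotkeys:
        if entry["key"] == key and entry.get("channel") == channel:
            return entry
    return None


print(resolve_channel(HOTKEYS, "RightCommand")["name"])
print(resolve_channel(HOTKEYS, "RightCommand", "2")["name"])
```

Adding a destination then means appending one more entry with a new channel digit.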
Hands-free agent workflows, voice-driven automation, and programmable text pipelines.
Voice Activity Detection sends text as you speak. Combined with auto-Enter, you can talk to AI agents (Claude Code, ChatGPT, Slack bots) without touching the keyboard at all.
All settings in a single config.toml file. AI agents can configure SpeechButton programmatically — no GUI needed. Changes apply instantly without restart.
Process text before it's pasted: run it through a script, send it to an LLM API, or transform it locally. Get cleaned-up, formatted, or translated text — all from your voice.
Hold Command, then press 1, 2, or 3 to route your speech to different destinations. Send voice to one agent, then switch to another with a different channel — perfect for multi-agent workflows.
Send transcribed text to a webhook URL for integrations, or log everything to a file for history and audit. All outputs work simultaneously — paste, file, webhook, and exec at once.
Use your iPhone as an external microphone. With keep_hot = true the mic stays always-on — no 300ms wake-up delay when you start talking. Same blazing fast response, even over wireless.
Choose output_format = "text" for plain text or "json" for structured data with timestamps, language, and confidence — ideal for programmatic pipelines.
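A downstream script might consume the JSON output along these lines. The field names (`text`, `language`, `confidence`, `start`, `end`) are assumptions for illustration; check the actual payload before relying on them.

```python
import json

# Hypothetical payload; the real schema may differ.
SAMPLE = (
    '{"text": " fix the auth bug where tokens expire ", '
    '"language": "en", "confidence": 0.93, "start": 0.0, "end": 2.4}'
)


def accept(payload, min_confidence=0.8):
    """Return the trimmed text if confidence is high enough, else None."""
    record = json.loads(payload)
    if record.get("confidence", 0.0) < min_confidence:
        return None
    return record["text"].strip()


print(accept(SAMPLE))
```

Gating on confidence is one example of what structured output makes possible that plain text does not.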
[global]
model = "parakeet-tdt-0.6b-v3-int8"
# Default: paste transcription at cursor
[[hotkey]]
key = "RightCommand"
name = "default"
paste = "accessibility"
# ⌘+1: clean up speech with Local AI (free, offline)
[[hotkey]]
key = "RightCommand"
channel = "1"
name = "cleanup"
transform = "prompts/cleanup.md"
# ⌘+2: translate to English with Local AI
[[hotkey]]
key = "RightCommand"
channel = "2"
name = "translate"
transform = "prompts/translate_en.md"
# ⌘+3: send to Slack
[[hotkey]]
key = "RightCommand"
channel = "3"
name = "slack"
transform = "prompts/slack_message.md"
exec = "SLACK_WEBHOOK_URL=https://hooks.slack.com/xxx integrations/send_slack.py"
# ⌘+4: create Linear issue
[[hotkey]]
key = "RightCommand"
channel = "4"
name = "linear"
transform = "prompts/linear_issue.md"
exec = "LINEAR_API_KEY=lin_api_xxx integrations/send_linear.py"
# Auto-detect iPhone mic, keep stream hot
[[device_rule]]
match = "iPhone"
keep_hot = true
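A script such as `integrations/send_slack.py` could look roughly like the sketch below. It assumes the transformed text arrives on standard input, which the config above does not confirm. The request body follows Slack's incoming-webhook format, a JSON object with a `text` field.

```python
import json
import os
import sys
import urllib.request


def build_request(webhook_url, text):
    """Build the POST request for a Slack incoming webhook."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def main():
    # Assumption: the transformed text is piped in on stdin.
    text = sys.stdin.read().strip()
    if not text:
        return 0
    request = build_request(os.environ["SLACK_WEBHOOK_URL"], text)
    with urllib.request.urlopen(request, timeout=10) as response:
        return 0 if response.status == 200 else 1

# A real script would end with: sys.exit(main())
```

Keeping the webhook URL in an environment variable matches the `exec` line above and keeps secrets out of the script itself.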
Your AI agents are waiting for instructions. Stop typing them.
Download for macOS
Requires macOS 14 Sonoma or later · Apple Silicon (M1+)