Voice Dictation for Developers

Spokenly is a voice dictation app for macOS, Windows, Linux, and iOS. Press a shortcut, speak, and transcribed text appears at the cursor. It works in any app on your Mac, including Claude Code, Codex, and Cursor. On top of that, it connects to coding agents via MCP, so the agent can ask you questions and get voice answers directly.

Updated July 2026

Download Spokenly

Built-in Dictation vs Spokenly

Claude Code, Codex, and Cursor all added some form of built-in dictation in 2026. These work for quick prompts inside each tool, but each relies on a single cloud transcription engine, which can struggle with technical terms.

Spokenly is a standalone voice dictation app. Press a keyboard shortcut, speak, and text appears at the cursor in any app. It supports a wide range of transcription models: local Whisper and Parakeet for offline use, plus cloud options like GPT-4o Transcribe, Deepgram Nova, and Groq Whisper. Because you can pick the model that handles your vocabulary best, you get better accuracy for technical conversations than the single-engine dictation built into coding agents.

Spokenly also integrates with coding agents via MCP. The agent can call Spokenly directly when it needs your input. You see the question, speak your answer, and the transcribed text goes back to the agent automatically.

How It Works

Spokenly runs a local MCP server

When Spokenly is open, it listens on localhost:51089. No cloud, no account needed.
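
If an agent cannot reach Spokenly, a quick first check is whether anything is listening on that port. The sketch below only tests that a TCP connection to localhost:51089 succeeds; it does not confirm that the listener is Spokenly or speak MCP to it. The function name is illustrative and not part of Spokenly.

```python
import socket

# Host and port taken from the text above.
SPOKENLY_HOST = "localhost"
SPOKENLY_PORT = 51089


def is_port_open(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    if is_port_open(SPOKENLY_HOST, SPOKENLY_PORT):
        print("Something is listening on port 51089 (likely Spokenly).")
    else:
        print("Nothing is listening on port 51089. Is Spokenly open?")
```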

Your AI tool connects to it

A one-time setup command connects your agent to Spokenly's MCP server. Takes under a minute.

Agent asks, you speak

When the agent needs input, it calls Spokenly's dictation tool. You see the question, speak your answer, and press Enter.

Transcribed text goes back to the agent

Spokenly transcribes your speech and sends the text back. The agent continues with your answer as context.
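
The round trip above can be sketched as two JSON-RPC 2.0 messages, which is the wire format MCP uses for tool calls. The tool name `ask_user_dictation` and the `question` argument are hypothetical placeholders; the names Spokenly actually registers are not stated here. Only the general `tools/call` request shape and the text-content result shape follow the MCP specification.

```python
import json

# Hypothetical tool name, for illustration only.
TOOL_NAME = "ask_user_dictation"


def build_tool_call(request_id: int, question: str) -> dict:
    """Build the MCP `tools/call` request an agent would send."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        # The "question" argument name is an assumption.
        "params": {"name": TOOL_NAME, "arguments": {"question": question}},
    }


def build_tool_result(request_id: int, transcript: str) -> dict:
    """Build the response a server would return with the transcribed answer."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "result": {
            "content": [{"type": "text", "text": transcript}],
            "isError": False,
        },
    }


def extract_text(response: dict) -> str:
    """Join all text parts of a tool result into one string."""
    parts = response["result"]["content"]
    return "".join(p["text"] for p in parts if p.get("type") == "text")


if __name__ == "__main__":
    request = build_tool_call(1, "Which database should the migration target?")
    response = build_tool_result(1, "Use Postgres for now.")
    print(json.dumps(request, indent=2))
    print(extract_text(response))
```

The agent matches the response to its request through the shared `id`, then continues with the extracted text as context.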

Supported Tools

Claude Code

stdio MCP · CLI

Anthropic's CLI coding agent. Connects via stdio MCP with no response timeout.

Claude Cowork

stdio MCP · Desktop

Anthropic's agentic AI for knowledge work. Connects via stdio MCP for voice input during autonomous tasks.

Codex CLI

HTTP MCP · CLI

OpenAI's CLI coding agent. Connects via HTTP MCP for voice input during conversations.

Cursor

HTTP MCP · IDE

AI-powered code editor. Connects via HTTP MCP for voice input in agent and composer modes.

Spokenly works with any tool that supports MCP. See the full setup guide for custom configurations.
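
For a client that reads HTTP MCP servers from a JSON file, a custom entry might look like the one generated below. This is a sketch under assumptions: the `mcpServers` and `url` keys follow the layout several MCP clients use for HTTP servers, the server name `spokenly` is arbitrary, and the exact endpoint path on port 51089 is not stated here, so confirm the URL against the setup guide before using it.

```python
import json


def build_mcp_config(server_name: str, url: str) -> dict:
    """Return a client config dict with one HTTP MCP server entry."""
    return {"mcpServers": {server_name: {"url": url}}}


if __name__ == "__main__":
    # Assumed values: only the host and port come from the text; the path may differ.
    config = build_mcp_config("spokenly", "http://localhost:51089")
    print(json.dumps(config, indent=2))
```

Where the file lives and whether extra fields are needed depends on the client, so treat the output as a starting point.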

Why Spokenly

Push-to-talk anywhere

Press a shortcut, speak, and text appears at the cursor. Works in any app on your Mac: terminal, editor, browser, email. One app for all your voice input.

Better transcription models

Choose from local Whisper, Parakeet, or cloud models like GPT-4o Transcribe, Deepgram Nova, and Groq. Much better accuracy for technical terms than built-in dictation in coding agents.

Agent-initiated voice via MCP

Spokenly also connects to any MCP-compatible agent. The agent calls Spokenly when it needs your input. You see the question, speak your answer, and text goes back automatically.

Frequently Asked Questions

What is MCP and why does voice input for coding agents need it?

MCP (Model Context Protocol) is an open standard that lets AI tools connect to external services. Spokenly runs a local MCP server that exposes voice dictation as a capability any MCP-compatible agent can call. When the agent needs your input, it triggers a recording prompt instead of waiting for typed text.

Which AI coding agents support voice input through Spokenly?

Claude Code (Anthropic), Claude Cowork (Anthropic), Codex CLI (OpenAI), and Cursor all support MCP and work with Spokenly. Any other tool that implements MCP can also connect. Spokenly supports both stdio and HTTP MCP transports.

Is voice input for coding agents free?

Yes. Spokenly's MCP server and local speech-to-text models (Whisper, Parakeet) are free with no limits. Cloud models are available through the Pro plan or with your own API keys. Local models do not require a subscription.

Does my voice data stay private?

With local models, all speech processing happens on your Mac. Audio never leaves your device. You can enable Local Only Mode to ensure no network requests are made.

How accurate is voice-to-text for technical conversations?

Local Whisper and Parakeet models handle technical terms well, especially in English. For maximum accuracy with jargon-heavy dictation, Spokenly also supports cloud models like GPT-4o Transcribe and Deepgram Nova. Custom word replacements help fix recurring misrecognitions.
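
To illustrate the idea behind word replacements, here is a minimal sketch of a post-processing pass over a transcript. It is not Spokenly's implementation, which is not described here; the replacement table and function name are hypothetical. Phrases are matched case-insensitively on word boundaries, longest first, so a short phrase never overrides a longer one that contains it.

```python
import re

# Hypothetical table: misrecognized phrase -> intended term.
REPLACEMENTS = {
    "pie test": "pytest",
    "cube control": "kubectl",
    "post gress": "Postgres",
}


def apply_replacements(text: str, replacements: dict[str, str]) -> str:
    """Replace whole-phrase matches, ignoring case, longest phrase first."""
    if not replacements:
        return text
    phrases = sorted(replacements, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(p) for p in phrases) + r")\b",
        re.IGNORECASE,
    )
    # Lowercased keys so the lookup works for any capitalization in the match.
    lookup = {k.lower(): v for k, v in replacements.items()}
    return pattern.sub(lambda m: lookup[m.group(1).lower()], text)


if __name__ == "__main__":
    print(apply_replacements("Run pie test then Cube Control apply", REPLACEMENTS))
```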

Can I use voice and keyboard input together?

Yes. When an agent calls the voice dictation tool, a recording prompt appears. Speak and press Enter, or press Escape to skip and type instead. You always have the choice.

Talk to your coding agent

Set up in under a minute. Free with local models.

Free MCP server

Local models included

Works offline