Talk Mode & TTS
Talk Mode enables voice conversations with the agent — speech-to-text input and ElevenLabs text-to-speech output. It works via the OpenClaw iOS app, the TUI, and the control UI.
Prerequisites
- An ElevenLabs account with an API key
- OpenClaw with TTS support (2026.2.17+)
- For mobile: the OpenClaw iOS app built from source and paired to the gateway
Configuration
1. Add the ElevenLabs API Key
Add the key to your .env file:
```shell
echo "ELEVENLABS_API_KEY=your-key-here" >> ~/.openclaw/.env
```

Document it in .env.example (already done if using the Lobster repo config):

```shell
ELEVENLABS_API_KEY=your-elevenlabs-api-key
```

2. Configure the Talk Section
Add the talk section to openclaw.json:
```json
{
  "talk": {
    "voiceId": "1SM7GgM6IMuvQlz2BwM3",
    "modelId": "eleven_v3",
    "outputFormat": "mp3_44100_128",
    "apiKey": "${ELEVENLABS_API_KEY}",
    "interruptOnSpeech": true
  }
}
```

| Field | Purpose |
|---|---|
| `voiceId` | ElevenLabs voice ID (see Choosing a Voice) |
| `modelId` | ElevenLabs model — `eleven_v3` is the latest multilingual model |
| `outputFormat` | Audio format — `mp3_44100_128` is high quality, widely compatible |
| `apiKey` | `${VAR}` reference to your ElevenLabs API key |
| `interruptOnSpeech` | Stop TTS playback when the user starts speaking |
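The `${ELEVENLABS_API_KEY}` reference is resolved from the environment when the config is loaded. A minimal sketch of that substitution pattern, for illustration only; Python's `os.path.expandvars` stands in for whatever the gateway uses internally:

```python
import os

# Stand-in for the gateway's "${VAR}" expansion (actual mechanism assumed).
raw = '{"talk": {"apiKey": "${ELEVENLABS_API_KEY}"}}'

os.environ["ELEVENLABS_API_KEY"] = "sk-example"  # normally sourced from ~/.openclaw/.env
resolved = os.path.expandvars(raw)
print(resolved)  # {"talk": {"apiKey": "sk-example"}}
```

This is why the raw key never needs to appear in openclaw.json itself.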
3. Enable the TTS Tool
The tts tool must be in the agent’s alsoAllow list:
```json
{ "tools": { "alsoAllow": ["tts", ...] } }
```

4. Configure the SAG Skill (Optional)
The sag CLI provides advanced voice features — auditioning voices, file output, and speaker playback:
```shell
brew install sag
```

Add the skill config to openclaw.json:

```json
{ "skills": { "entries": { "sag": { "apiKey": "${ELEVENLABS_API_KEY}" } } } }
```

5. Restart the Gateway

```shell
openclaw gateway restart
```

Choosing a Voice
Browse voices at elevenlabs.io/voice-library. Each voice has an ID you can copy.
Default voice: Mark — Casual, Relaxed and Light (1SM7GgM6IMuvQlz2BwM3).
To change the voice, update talk.voiceId in openclaw.json and restart the gateway.
You can also audition voices via the sag CLI skill — ask the agent to “try a different voice” and it can use sag to preview options.
Using Talk Mode
Via iOS App
- Build the OpenClaw iOS app from source (see the OpenClaw GitHub repo)
- Pair it to your gateway (scan the QR code from the dashboard)
- Tap the microphone icon to start a voice conversation
- Speech is transcribed to text, sent to the agent, and the reply is spoken back via ElevenLabs
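The last step in that pipeline is an ElevenLabs synthesis call. A rough sketch of what that request looks like, based on the public ElevenLabs text-to-speech API; the gateway's internals are an assumption:

```python
import json
import os
import urllib.request

def build_tts_request(text: str, voice_id: str, api_key: str) -> urllib.request.Request:
    # POST /v1/text-to-speech/{voice_id} with the key in the xi-api-key header;
    # output_format matches the talk config above.
    url = (
        f"https://api.elevenlabs.io/v1/text-to-speech/{voice_id}"
        "?output_format=mp3_44100_128"
    )
    body = json.dumps({"text": text, "model_id": "eleven_v3"}).encode()
    return urllib.request.Request(
        url,
        data=body,
        headers={"xi-api-key": api_key, "Content-Type": "application/json"},
    )

if __name__ == "__main__" and "ELEVENLABS_API_KEY" in os.environ:
    req = build_tts_request("Hello from Talk Mode!", "1SM7GgM6IMuvQlz2BwM3",
                            os.environ["ELEVENLABS_API_KEY"])
    with urllib.request.urlopen(req) as resp:
        open("reply.mp3", "wb").write(resp.read())  # MP3 audio bytes
```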
Via TUI
```shell
openclaw tui
```

Use `/tts on` to enable voice replies in the terminal interface.
Via Control UI
Open the dashboard at http://127.0.0.1:18789/ and use the Talk Mode interface.
Toggle Per-Session
```shell
/tts on    # Enable voice replies
/tts off   # Disable voice replies
```

Voice Replies in iMessage
When Talk Mode is active, the agent can send voice memos as iMessage attachments. This requires:
- BlueBubbles Private API enabled (for attachment sending)
- The `tts` tool in the agent’s allowed tools
- Audio format compatible with iMessage (MP3 or CAF)
Cost Considerations
ElevenLabs charges per character of text synthesized. Voice replies for typical agent responses (1-3 sentences) use roughly 100-300 characters each. Monitor usage at elevenlabs.io/app/usage.
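A quick back-of-envelope estimate using the 100-300 character range above. The reply volume is an assumption to adjust for your own usage, and the tier quotas mirror the Pricing Tiers table on this page:

```python
# Back-of-envelope monthly character estimate for voice replies.
AVG_CHARS_PER_REPLY = 200   # midpoint of the 100-300 range above
REPLIES_PER_DAY = 10        # assumption: adjust for your usage

monthly_chars = AVG_CHARS_PER_REPLY * REPLIES_PER_DAY * 30
print(f"~{monthly_chars:,} characters/month")  # ~60,000 characters/month

# Smallest tier whose quota covers the estimate: (chars/month, name)
tiers = [(10_000, "Free"), (30_000, "Starter"), (100_000, "Creator")]
fit = next((t for t in tiers if monthly_chars <= t[0]), None)
print(fit[1] if fit else "Above Creator: check elevenlabs.io/pricing")  # Creator
```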
Talk Mode vs the tts Tool
These are two different things:
- Talk Mode (gateway-native): The gateway automatically converts the agent’s text reply into audio using ElevenLabs. The agent just replies with plain text. This is what powers voice conversations via the iOS app, TUI, and control UI.
- `tts` tool (agent-level): The agent explicitly calls the `tts` tool to generate an MP3 file. Use this for proactive voice messages (e.g., sending a voice memo via iMessage), storytelling with the `sag` skill, or when you want audio output outside of Talk Mode.
Important: In Talk Mode sessions, the agent should reply with normal text — not call the tts tool. Calling tts and replying NO_REPLY bypasses the gateway’s audio pipeline and the user hears nothing.
Pricing Tiers
| Tier | Characters/month | Cost | Notes |
|---|---|---|---|
| Free | 10,000 | $0 | Good for testing |
| Starter | 30,000 | $5/mo | Recommended for personal use |
| Creator | 100,000 | $22/mo | For heavy voice usage |
Monitor usage at elevenlabs.io/app/usage.