Talk Mode & TTS

Talk Mode enables voice conversations with the agent: speech-to-text for input and ElevenLabs text-to-speech for output. It is available in the OpenClaw iOS app, the TUI, and the control UI.

Prerequisites

An ElevenLabs account with an API key
OpenClaw with TTS support (version 2026.2.17 or later)
For mobile: the OpenClaw iOS app built from source and paired to the gateway

Configuration

1. Add the ElevenLabs API Key

Add the key to your .env file:

echo "ELEVENLABS_API_KEY=your-key-here" >> ~/.openclaw/.env

Document it in .env.example (already done if using the Lobster repo config):

ELEVENLABS_API_KEY=your-elevenlabs-api-key
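
Before moving on, it can help to confirm that the key actually landed in the file. The following is a minimal sketch, not part of OpenClaw: the helper names `read_env_value` and `key_is_configured` are hypothetical, and it assumes a plain `NAME=value` dotenv layout in which the last assignment wins, which is how most dotenv loaders behave.

```python
from pathlib import Path
from typing import Optional


def read_env_value(env_path: Path, name: str) -> Optional[str]:
    """Return the value assigned to `name` in a dotenv-style file, or None."""
    if not env_path.exists():
        return None
    value = None
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, val = line.partition("=")
        if key.strip() == name:
            # Assumption: a later assignment overrides an earlier one,
            # which matches the effect of appending with >>.
            value = val.strip().strip('"').strip("'")
    return value


def key_is_configured(env_path: Path, name: str = "ELEVENLABS_API_KEY") -> bool:
    """True when the key is present and is not the placeholder text."""
    value = read_env_value(env_path, name)
    return bool(value) and value != "your-key-here"
```

Calling `key_is_configured(Path.home() / ".openclaw" / ".env")` returns `False` while the file still holds the `your-key-here` placeholder from the command above.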

2. Configure the Talk Section

Add the talk section to openclaw.json:

{
  "talk": {
    "voiceId": "1SM7GgM6IMuvQlz2BwM3",
    "modelId": "eleven_v3",
    "outputFormat": "mp3_44100_128",
    "apiKey": "${ELEVENLABS_API_KEY}",
    "interruptOnSpeech": true
  }
}

Field	Purpose
`voiceId`	ElevenLabs voice ID (see Choosing a Voice)
`modelId`	ElevenLabs model — `eleven_v3` is the latest multilingual model
`outputFormat`	Audio format — `mp3_44100_128` is high quality, widely compatible
`apiKey`	`${VAR}` reference to your ElevenLabs API key
`interruptOnSpeech`	Stop TTS playback when the user starts speaking
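
A quick sanity check of the talk section can catch typos before the gateway restart. This is a sketch under stated assumptions rather than OpenClaw's own validation: the function `validate_talk` is hypothetical, and treating the four string fields as required is an assumption drawn from the example above.

```python
import re

# Assumption: these four string fields are treated as required.
REQUIRED_FIELDS = ("voiceId", "modelId", "outputFormat", "apiKey")
# Matches a whole-value reference such as ${ELEVENLABS_API_KEY}.
ENV_REF = re.compile(r"^\$\{([A-Za-z_][A-Za-z0-9_]*)\}$")


def validate_talk(config: dict, env: dict) -> list:
    """Return a list of human-readable problems found in config["talk"]."""
    talk = config.get("talk")
    if not isinstance(talk, dict):
        return ["missing 'talk' section"]
    problems = []
    for field in REQUIRED_FIELDS:
        value = talk.get(field)
        if not isinstance(value, str) or not value:
            problems.append(f"'{field}' must be a non-empty string")
    api_key = talk.get("apiKey")
    match = ENV_REF.match(api_key) if isinstance(api_key, str) else None
    # Only ${VAR} references are resolved; literal keys are left alone.
    if match and not env.get(match.group(1)):
        problems.append(f"environment variable {match.group(1)} is not set")
    flag = talk.get("interruptOnSpeech")
    if flag is not None and not isinstance(flag, bool):
        problems.append("'interruptOnSpeech' must be true or false")
    return problems
```

Load `openclaw.json` with `json.load`, then pass the result and `os.environ` to the function. An empty list means the section passed these checks.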

3. Enable the TTS Tool

The tts tool must be in the agent’s alsoAllow list:

{
  "tools": {
    "alsoAllow": ["tts", ...]
  }
}
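
If the config is edited by script, the tool should be appended without disturbing the entries that `...` stands for above. The sketch below assumes the `tools.alsoAllow` shape shown in the example; `allow_tool` is a hypothetical helper, not an OpenClaw API.

```python
def allow_tool(config: dict, tool: str = "tts") -> dict:
    """Return a copy of config with `tool` present once in tools.alsoAllow."""
    updated = dict(config)
    tools = dict(updated.get("tools") or {})
    allowed = list(tools.get("alsoAllow") or [])
    # Keep existing entries and their order; add the tool only if missing.
    if tool not in allowed:
        allowed.append(tool)
    tools["alsoAllow"] = allowed
    updated["tools"] = tools
    return updated
```

The input dictionary is left unmodified, so the result can be compared against the original before it is written back to `openclaw.json`.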

4. Configure the SAG Skill (Optional)

The sag CLI provides advanced voice features — auditioning voices, file output, and speaker playback:

brew install sag

Add the skill config to openclaw.json:

{
  "skills": {
    "entries": {
      "sag": {
        "apiKey": "${ELEVENLABS_API_KEY}"
      }
    }
  }
}

5. Restart the Gateway

openclaw gateway restart

Choosing a Voice

Browse voices at elevenlabs.io/voice-library. Each voice has an ID you can copy.

Default voice: Mark — Casual, Relaxed and Light (1SM7GgM6IMuvQlz2BwM3).

To change the voice, update talk.voiceId in openclaw.json and restart the gateway.

You can also audition voices via the sag CLI skill — ask the agent to “try a different voice” and it can use sag to preview options.

Using Talk Mode

Via iOS App

Build the OpenClaw iOS app from source (see the OpenClaw GitHub repo)
Pair it to your gateway (scan the QR code from the dashboard)
Tap the microphone icon to start a voice conversation
Your speech is transcribed to text and sent to the agent; the reply is spoken back via ElevenLabs

Via TUI

openclaw tui

Use /tts on to enable voice replies in the terminal interface.

Via Control UI

Open the dashboard at http://127.0.0.1:18789/ and use the Talk Mode interface.

Toggle Per-Session

/tts on     # Enable voice replies
/tts off    # Disable voice replies

Voice Replies in iMessage

With Talk Mode configured, the agent can also send voice memos as iMessage attachments, generated with the tts tool. This requires:

BlueBubbles Private API enabled (for attachment sending)
The tts tool in the agent’s allowed tools
Audio format compatible with iMessage (MP3 or CAF)
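
The format requirement can be checked against the `outputFormat` value from the talk section. The following sketch assumes the value follows a `codec_sampleRate[_bitrate]` pattern, as `mp3_44100_128` suggests. The names `AudioFormat`, `parse_output_format`, and `imessage_compatible` are hypothetical, and the codec set is taken only from the requirement above.

```python
from typing import NamedTuple, Optional


class AudioFormat(NamedTuple):
    codec: str
    sample_rate_hz: int
    bitrate_kbps: Optional[int]


# Codecs named in the requirement above.
IMESSAGE_CODECS = {"mp3", "caf"}


def parse_output_format(value: str) -> AudioFormat:
    """Split a value such as 'mp3_44100_128' into its parts."""
    parts = value.strip().lower().split("_")
    numbers = parts[1:]
    if not parts[0] or len(numbers) not in (1, 2) or not all(n.isdigit() for n in numbers):
        raise ValueError(f"unrecognized output format: {value!r}")
    # The bitrate is optional in this assumed pattern.
    bitrate = int(numbers[1]) if len(numbers) == 2 else None
    return AudioFormat(parts[0], int(numbers[0]), bitrate)


def imessage_compatible(value: str) -> bool:
    """True when the codec part is one that iMessage accepts."""
    return parse_output_format(value).codec in IMESSAGE_CODECS
```

With the configuration shown earlier, `imessage_compatible("mp3_44100_128")` is true because the codec part is `mp3`.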

Cost Considerations

ElevenLabs charges per character of text synthesized. Voice replies for typical agent responses (1-3 sentences) use roughly 100-300 characters each. Monitor usage at elevenlabs.io/app/usage.
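
To turn the per-reply figure into a monthly estimate, multiply it by the number of replies per day and the number of days. A minimal sketch, using the 100 to 300 character range quoted above and a hypothetical helper name:

```python
# Per-reply range quoted above for a 1-3 sentence response.
LOW_CHARS_PER_REPLY = 100
HIGH_CHARS_PER_REPLY = 300


def monthly_character_range(replies_per_day: int, days: int = 30) -> tuple:
    """Return (low, high) estimated characters synthesized over `days`."""
    low = replies_per_day * LOW_CHARS_PER_REPLY * days
    high = replies_per_day * HIGH_CHARS_PER_REPLY * days
    return (low, high)
```

For example, 10 voice replies a day over 30 days comes to between 30,000 and 90,000 characters.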

Talk Mode vs the `tts` Tool

These are two different things:

Talk Mode (gateway-native): The gateway automatically converts the agent’s text reply into audio using ElevenLabs. The agent just replies with plain text. This is what powers voice conversations via the iOS app, TUI, and control UI.
tts tool (agent-level): The agent explicitly calls the tts tool to generate an MP3 file. Use this for proactive voice messages (e.g., sending a voice memo via iMessage), storytelling with the sag skill, or when you want audio output outside of Talk Mode.

Important: In Talk Mode sessions, the agent should reply with normal text and must not call the tts tool. Calling tts and then replying NO_REPLY bypasses the gateway’s audio pipeline, so the user hears nothing.
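
The rule can be summarized as a small decision function. This is an illustrative sketch only: `ReplyPath` and `choose_reply_path` are hypothetical names, and the real decision is made by the agent from its instructions, not by code like this.

```python
from enum import Enum


class ReplyPath(Enum):
    PLAIN_TEXT = "plain_text"  # the gateway synthesizes audio in Talk Mode
    TTS_TOOL = "tts_tool"      # the agent generates an MP3 file itself


def choose_reply_path(talk_mode_session: bool, wants_audio_file: bool) -> ReplyPath:
    """Pick how the agent should answer, following the rule above."""
    if talk_mode_session:
        # The gateway handles audio, so calling tts here would silence the reply.
        return ReplyPath.PLAIN_TEXT
    return ReplyPath.TTS_TOOL if wants_audio_file else ReplyPath.PLAIN_TEXT
```

The only case that reaches the tts tool is a request for an audio file outside a Talk Mode session, such as an iMessage voice memo.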

Pricing Tiers

Tier	Characters/month	Cost	Notes
Free	10,000	$0	Good for testing
Starter	30,000	$5/mo	Recommended for personal use
Creator	100,000	$22/mo	For heavy voice usage
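
Given a monthly character estimate, the smallest tier that covers it can be looked up from the table. The sketch below copies the quotas and prices from the table above; `smallest_tier` is a hypothetical helper, and higher tiers not listed here are out of scope.

```python
# (name, characters per month, USD per month), copied from the table above.
TIERS = [
    ("Free", 10_000, 0),
    ("Starter", 30_000, 5),
    ("Creator", 100_000, 22),
]


def smallest_tier(monthly_chars: int):
    """Return the name of the cheapest listed tier whose quota covers the usage."""
    for name, quota, _cost in TIERS:
        if monthly_chars <= quota:
            return name
    # Usage exceeds every tier listed in the table.
    return None
```

A result of `None` means the estimate exceeds the Creator quota and a larger plan would be needed.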

Monitor usage at elevenlabs.io/app/usage.
