Inworld is a streaming text-to-speech (TTS) provider. In OpenClaw it synthesizes outbound reply audio (MP3 by default, OGG_OPUS for voice notes) and PCM audio for telephony channels such as Voice Call. OpenClaw posts to Inworld’s streaming TTS endpoint, concatenates the returned base64 audio chunks into a single buffer, and hands the result to the standard reply-audio pipeline.
| Detail | Value |
| --- | --- |
| Website | inworld.ai |
| Docs | docs.inworld.ai/tts/tts |
| Auth | INWORLD_API_KEY (HTTP Basic, Base64 dashboard credential) |
| Default voice | Sarah |
| Default model | inworld-tts-1.5-max |
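
The chunk handling described in the introduction can be illustrated with a minimal sketch. It assumes each streamed chunk arrives as an object carrying one base64 string; the `AudioChunk` type, its `audioContent` field, and the `concatBase64Chunks` helper are hypothetical names chosen for illustration, not OpenClaw's or Inworld's actual identifiers.

```typescript
// Hypothetical shape of one streamed chunk; the real response fields may differ.
interface AudioChunk {
  audioContent: string; // base64-encoded audio bytes (assumed field name)
}

function concatBase64Chunks(chunks: AudioChunk[]): Buffer {
  // Decode each chunk on its own, then join the raw bytes.
  // Joining the base64 strings first can corrupt the audio when a chunk
  // ends with "=" padding in the middle of the stream.
  const parts = chunks.map((chunk) => Buffer.from(chunk.audioContent, "base64"));
  return Buffer.concat(parts);
}

// Usage with stand-in text payloads instead of real audio:
const demo = concatBase64Chunks([
  { audioContent: Buffer.from("hello ").toString("base64") },
  { audioContent: Buffer.from("world").toString("base64") },
]);
console.log(demo.toString("utf8")); // prints "hello world"
```

Decoding per chunk keeps the result byte-exact regardless of where the stream happens to split the audio.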

Getting started

1. Set your API key

Copy the credential from your Inworld dashboard (Workspace > API Keys) and set it as an environment variable. The value is sent verbatim as the HTTP Basic credential, so do not Base64-encode it again or convert it to a bearer token.
INWORLD_API_KEY=<base64-credential-from-dashboard>
2. Select Inworld in messages.tts

{
  messages: {
    tts: {
      auto: "always",
      provider: "inworld",
      providers: {
        inworld: {
          voiceId: "Sarah",
          modelId: "inworld-tts-1.5-max",
        },
      },
    },
  },
}
3. Send a message

Send a reply through any connected channel. OpenClaw synthesizes the audio with Inworld and delivers it as MP3 (or OGG_OPUS when the channel expects a voice note).
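
The credential handling from step 1 can be sketched as follows. This is an illustration under stated assumptions, not OpenClaw's actual code: `resolveApiKey` and `buildAuthHeader` are hypothetical helpers, and the environment is passed in as a plain object so the sketch stays self-contained.

```typescript
// Hypothetical subset of the provider config, for illustration only.
interface InworldProviderConfig {
  apiKey?: string;
}

// Prefer the configured apiKey, otherwise fall back to INWORLD_API_KEY.
function resolveApiKey(
  config: InworldProviderConfig,
  env: Record<string, string | undefined>,
): string {
  const key = config.apiKey || env.INWORLD_API_KEY;
  if (!key) {
    throw new Error("No Inworld credential: set apiKey or INWORLD_API_KEY");
  }
  return key;
}

// The dashboard credential is already Base64, so it is passed through
// unchanged: no second encoding step and no "Bearer" prefix.
function buildAuthHeader(apiKey: string): Record<string, string> {
  return { Authorization: `Basic ${apiKey}` };
}

// Usage with a placeholder credential:
const header = buildAuthHeader(resolveApiKey({}, { INWORLD_API_KEY: "abc123" }));
console.log(header.Authorization); // prints "Basic abc123"
```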

Configuration options

| Option | Path | Description |
| --- | --- | --- |
| apiKey | messages.tts.providers.inworld.apiKey | Base64 dashboard credential. Falls back to INWORLD_API_KEY. |
| baseUrl | messages.tts.providers.inworld.baseUrl | Override Inworld API base URL (default https://api.inworld.ai). |
| voiceId | messages.tts.providers.inworld.voiceId | Voice identifier (default Sarah). |
| modelId | messages.tts.providers.inworld.modelId | TTS model id (default inworld-tts-1.5-max). |
| temperature | messages.tts.providers.inworld.temperature | Sampling temperature 0..2 (optional). |
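
As a rough illustration of how these options and their defaults fit together, the sketch below fills in the documented defaults and flags values outside the documented ranges. The `InworldTtsOptions` type and both helpers are hypothetical; the sketch also assumes the 0..2 temperature range is inclusive and that only the model ids listed on this page are accepted, neither of which is confirmed here.

```typescript
// Model ids listed on this page; treating them as an allow-list is an assumption.
const SUPPORTED_MODEL_IDS = [
  "inworld-tts-1.5-max",
  "inworld-tts-1.5-mini",
  "inworld-tts-1-max",
  "inworld-tts-1",
];

// Hypothetical type mirroring the options table.
interface InworldTtsOptions {
  apiKey?: string;
  baseUrl?: string;
  voiceId?: string;
  modelId?: string;
  temperature?: number;
}

// Fill in the documented defaults for any option left unset.
function applyDefaults(opts: InworldTtsOptions) {
  return {
    ...opts,
    baseUrl: opts.baseUrl ?? "https://api.inworld.ai",
    voiceId: opts.voiceId ?? "Sarah",
    modelId: opts.modelId ?? "inworld-tts-1.5-max",
  };
}

// Return human-readable problems; an empty array means the options look fine.
function findProblems(opts: InworldTtsOptions): string[] {
  const resolved = applyDefaults(opts);
  const problems: string[] = [];
  if (SUPPORTED_MODEL_IDS.indexOf(resolved.modelId) === -1) {
    problems.push(`unknown modelId: ${resolved.modelId}`);
  }
  const t = resolved.temperature;
  if (t !== undefined && (Number.isNaN(t) || t < 0 || t > 2)) {
    problems.push(`temperature must be within 0..2, got ${t}`);
  }
  return problems;
}

console.log(applyDefaults({}));
console.log(findProblems({ modelId: "inworld-tts-1.5-mini", temperature: 0.7 }));
```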

Notes

- Inworld uses HTTP Basic auth with a single Base64-encoded credential string. Copy it verbatim from the Inworld dashboard. The provider sends it as Authorization: Basic <apiKey> without any further encoding, so do not Base64-encode it yourself and do not pass a bearer-style token. See TTS auth notes for the same callout.
- Supported model ids: inworld-tts-1.5-max (default), inworld-tts-1.5-mini, inworld-tts-1-max, inworld-tts-1.
- Replies use MP3 by default. When the channel target is a voice note, OpenClaw asks Inworld for OGG_OPUS so the audio plays as a native voice bubble. Telephony synthesis uses raw PCM at 22050 Hz to feed the telephony bridge.
- Override the API host with messages.tts.providers.inworld.baseUrl. Trailing slashes are stripped before requests are sent.
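
The format selection and base URL handling from these notes could look roughly like the sketch below. All names are hypothetical (`TtsTarget`, `AudioRequest`, `pickAudioFormat`, `normalizeBaseUrl`), and the encoding labels simply reuse the terms on this page; the identifiers Inworld's API actually expects, for raw PCM in particular, may differ.

```typescript
// Hypothetical delivery targets, named after the cases described in the notes.
type TtsTarget = "reply" | "voice-note" | "telephony";

// Hypothetical request fragment; real field and enum names may differ.
interface AudioRequest {
  encoding: "MP3" | "OGG_OPUS" | "PCM";
  sampleRateHz?: number;
}

function pickAudioFormat(target: TtsTarget): AudioRequest {
  switch (target) {
    case "voice-note":
      // Plays as a native voice bubble on channels that support it.
      return { encoding: "OGG_OPUS" };
    case "telephony":
      // Raw PCM at 22050 Hz for the telephony bridge.
      return { encoding: "PCM", sampleRateHz: 22050 };
    default:
      // Ordinary replies use MP3.
      return { encoding: "MP3" };
  }
}

// Strip every trailing slash so later path joins do not produce "//".
function normalizeBaseUrl(baseUrl: string = "https://api.inworld.ai"): string {
  return baseUrl.replace(/\/+$/, "");
}

console.log(pickAudioFormat("telephony"));
console.log(normalizeBaseUrl("https://tts.example.test/")); // prints "https://tts.example.test"
```

Keeping both decisions in small pure functions makes them easy to check without issuing a network request.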

- Text-to-speech: TTS overview, providers, and messages.tts config.
- Configuration: Full config reference including messages.tts settings.
- Providers: All bundled OpenClaw providers.
- Troubleshooting: Common issues and debugging steps.