Overview
Step 1 — Generate a voice embedding
Upload a reference audio clip to get a speaker_embedding. See Voice Clone Embeddings.
Pass the speaker_embedding from Voice Clone Embeddings to use your cloned voice. For simpler use cases, see Voice Cloning REST or Voice Cloning Streaming.
Endpoint
Authentication
All Realtime connections require the following headers:

| Header | Required | Description | Example |
|---|---|---|---|
| Content-Type | Yes | Must be application/json | application/json |
| X-API-Key-ID | Yes | Your API key for authentication | <your-api-key-id> |
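
As an illustration of the header table above, the following minimal Python sketch builds the two required headers. The helper name build_realtime_headers is hypothetical and not part of any SDK; how the headers are attached to the connection depends on the client library you use, which this page does not specify.

```python
def build_realtime_headers(api_key_id: str) -> dict[str, str]:
    """Return the two headers listed as required for Realtime connections.

    Hypothetical helper for illustration only; it is not part of an official SDK.
    """
    if not api_key_id:
        raise ValueError("api_key_id must be a non-empty string")
    return {
        "Content-Type": "application/json",  # must be application/json
        "X-API-Key-ID": api_key_id,          # your API key ID
    }


if __name__ == "__main__":
    # Placeholder value; substitute your real API key ID.
    print(build_realtime_headers("<your-api-key-id>"))
```

Keeping header construction in one place makes it easier to avoid sending a request without the API key header.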
Request Format
Send a JSON message with the following structure:
Response
The server streams audio data in real time as binary chunks. Each chunk contains PCM audio data according to the specified audio_config.
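
One way to consume the streamed chunks is to append them to a WAV file as they arrive. The sketch below uses Python's standard wave module and is transport-agnostic: chunks can be any iterable of bytes objects, however your client receives them. The function name write_pcm_chunks_to_wav is hypothetical, and the defaults (24000 Hz, mono, 16-bit samples) are assumptions for illustration only; use the values that match the audio_config you sent.

```python
import wave
from typing import Iterable


def write_pcm_chunks_to_wav(
    chunks: Iterable[bytes],
    path: str,
    sample_rate: int = 24000,  # assumption: match your audio_config
    channels: int = 1,         # assumption: mono
    sample_width: int = 2,     # assumption: 16-bit PCM (2 bytes per sample)
) -> int:
    """Write raw PCM chunks to a WAV file and return the number of bytes written."""
    total_bytes = 0
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)
        wav_file.setframerate(sample_rate)
        for chunk in chunks:
            # Each chunk is raw PCM, so it can be appended as-is.
            wav_file.writeframes(chunk)
            total_bytes += len(chunk)
    return total_bytes


if __name__ == "__main__":
    # Stand-in data: two chunks of 16-bit silence instead of a live stream.
    fake_chunks = [b"\x00\x00" * 2400, b"\x00\x00" * 2400]
    written = write_pcm_chunks_to_wav(fake_chunks, "output.wav")
    print(f"wrote {written} bytes of PCM audio")
```

Writing chunks as they arrive keeps memory use flat for long responses; for live playback you would hand each chunk to an audio output device instead of a file.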