Voice Cloned TTS (Streaming)
Stream cloned voice audio in chunks via Server-Sent Events.
POST
Overview
Step 1 — Generate a voice embedding
Upload a reference audio clip to get a speaker_embedding. See Voice Clone Embeddings.
Then pass the speaker_embedding from Voice Clone Embeddings in your request to this endpoint to use your cloned voice. Streaming reduces latency compared to Voice Cloned TTS REST. For the lowest latency, see Voice Cloned TTS Realtime.
Endpoint
Authentication
| Header | Required | Description | Example |
|---|---|---|---|
| X-API-Key-ID | Yes | Your API key for authentication | your-api-key-id |
| Content-Type | Yes | Must be application/json | application/json |
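As a minimal sketch, the two required headers from the table can be assembled like this. The helper name build_headers is hypothetical and not part of the API; only the header names and the application/json value come from the table.

```python
def build_headers(api_key_id: str) -> dict[str, str]:
    """Return the two headers the table above marks as required."""
    if not api_key_id:
        # Fail early instead of sending an unauthenticated request.
        raise ValueError("api_key_id must be a non-empty string")
    return {
        "X-API-Key-ID": api_key_id,
        "Content-Type": "application/json",
    }
```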
Request Body
The text to synthesize into speech
Voice cloning model to use. Currently supported: vachana-vc-v1
Audio output configuration
Voice clone embedding obtained from the Voice Clone Embeddings endpoint
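The four fields above might be combined into a JSON body roughly as follows. This is a sketch under assumptions: the field names text, model, and audio_config are guesses (only speaker_embedding and the model value vachana-vc-v1 appear on this page), the shape of the audio configuration and of the embedding is not specified here, and treating the audio configuration as optional is also an assumption. Check the actual schema before relying on it.

```python
import json
from typing import Any, Optional


def build_request_body(
    text: str,
    speaker_embedding: Any,
    audio_config: Optional[dict] = None,
    model: str = "vachana-vc-v1",
) -> str:
    """Serialize a request body; key names other than speaker_embedding are assumed."""
    if not text.strip():
        raise ValueError("text must not be empty")
    body: dict[str, Any] = {
        "text": text,                            # assumed key name
        "model": model,                          # assumed key name
        "speaker_embedding": speaker_embedding,  # value returned by Voice Clone Embeddings
    }
    if audio_config is not None:
        body["audio_config"] = audio_config      # assumed key name and shape
    return json.dumps(body)
```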
Response
The server streams audio data via Server-Sent Events (SSE). Each event contains a chunk of audio data encoded in base64.
Event Types
Contains base64-encoded audio data
Signals the end of the audio stream
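To illustrate how a client could consume such a stream, here is a small SSE line parser that decodes audio chunks and stops at the end event. It is a sketch, not the service's reference client: the event names "audio" and "end" are hypothetical placeholders (the real names are not shown on this page), and it assumes the data field of an audio event holds the base64 string directly rather than a JSON wrapper.

```python
import base64
from typing import Iterable, Iterator

AUDIO_EVENT = "audio"  # hypothetical event name
END_EVENT = "end"      # hypothetical event name


def iter_audio_chunks(lines: Iterable[str]) -> Iterator[bytes]:
    """Yield decoded audio bytes from SSE lines until the end event arrives."""
    event_name = "message"  # SSE default when no "event:" field is sent
    data_parts: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if event_name == END_EVENT:
                return
            if event_name == AUDIO_EVENT and data_parts:
                yield base64.b64decode("\n".join(data_parts))
            event_name, data_parts = "message", []
            continue
        if line.startswith(":"):
            continue  # SSE comment, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # SSE strips one leading space after the colon
        if field == "event":
            event_name = value
        elif field == "data":
            data_parts.append(value)
```

The function takes any iterable of text lines, so it can be fed from an HTTP client's line iterator (for example, the lines of a streamed response) and the yielded bytes can be written to a file or audio device as they arrive.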
Example Request
Error Responses
Invalid text or audio configuration
Rate limit exceeded
Unexpected error occurred
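One way a client might react to these three errors is sketched below. The status codes 400, 429, and 500 are assumptions based on common HTTP usage; this page lists only the descriptions, not the codes. Marking the rate-limit and unexpected-error cases as retryable, and the exponential backoff schedule, are design choices of this sketch rather than documented behaviour.

```python
# Assumed mapping: status code -> (description from this page, retryable?)
ASSUMED_ERRORS: dict[int, tuple[str, bool]] = {
    400: ("Invalid text or audio configuration", False),
    429: ("Rate limit exceeded", True),
    500: ("Unexpected error occurred", True),
}


def describe_error(status_code: int) -> tuple[str, bool]:
    """Return (message, retryable) for a status code, with a non-retryable default."""
    return ASSUMED_ERRORS.get(status_code, (f"Unhandled status {status_code}", False))


def backoff_delays(max_retries: int, base_seconds: float = 0.5) -> list[float]:
    """Exponential backoff schedule in seconds: base, 2*base, 4*base, ..."""
    return [base_seconds * (2 ** attempt) for attempt in range(max_retries)]
```

A caller would retry only when describe_error reports the status as retryable, sleeping for each delay from backoff_delays in turn, and surface the message otherwise.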