TTS Stream
POST /api/v1/tts/sse
curl --request POST \
  --url https://api.vachana.ai/api/v1/tts/sse \
  --header 'Content-Type: application/json' \
  --header 'X-API-Key-ID: <x-api-key-id>' \
  --data '
{
  "audio_config": {
    "bitrate": "192k",
    "container": "mp3",
    "encoding": "linear_pcm",
    "num_channels": 1,
    "sample_rate": 44100,
    "sample_width": 2
  },
  "model": "vachana-voice-v3",
  "text": "नमस्ते, आप कैसे हैं?"
}
'
event: audio_chunk
data: UklGRiQAAABXQVZFZm10IBAAAAABAAEAQB8AAEAfAAABAAgAZGF0YQAAAAA=

event: completed
data: {"status": "success"}

Overview

Step 1: Generate a voice embedding
Upload a reference audio clip to get a speaker_embedding. See Voice Clone Embeddings.

Step 2: Synthesize with your cloned voice (this page)
Pass the speaker_embedding from Step 1 to this endpoint to stream cloned voice audio progressively.

This endpoint streams cloned voice audio as it is generated, using Server-Sent Events (SSE). Pass the speaker_embedding from Voice Clone Embeddings to use your cloned voice. Because audio arrives while synthesis is still in progress, streaming reduces latency compared to Voice Cloned TTS REST. For the lowest latency, see Voice Cloned TTS Realtime.

Endpoint

POST https://api.vachana.ai/api/v1/tts/sse

Authentication

| Header | Required | Description | Example |
| --- | --- | --- | --- |
| X-API-Key-ID | Yes | Your API key for authentication | your-api-key-id |
| Content-Type | Yes | Must be application/json | application/json |

Request Body

text (string, required): The text to synthesize into speech.
model (string, required): Voice cloning model to use. Currently supported: vachana-vc-v1
audio_config (object, required): Audio output configuration.
speaker_embedding (object, required): Voice clone embedding obtained from the Voice Clone Embeddings endpoint.
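
The four required fields above can be assembled and checked on the client before any request is sent. The following is a minimal sketch, not part of the API: the helper name `build_tts_payload` is hypothetical, and the default model and the example field values are taken from the Example Request on this page.

```python
REQUIRED_FIELDS = ("text", "model", "audio_config", "speaker_embedding")


def build_tts_payload(text, speaker_embedding, audio_config, model="vachana-vc-v1"):
    """Return the request body as a dict, rejecting empty required fields.

    Hypothetical helper: the API itself only defines the field names.
    """
    payload = {
        "text": text,
        "model": model,
        "audio_config": audio_config,
        "speaker_embedding": speaker_embedding,
    }
    # Catch empty strings, empty dicts, and None before making a network call.
    missing = [name for name in REQUIRED_FIELDS if not payload[name]]
    if missing:
        raise ValueError("missing required fields: " + ", ".join(missing))
    return payload


payload = build_tts_payload(
    text="नमस्ते, आप कैसे हैं?",
    speaker_embedding={
        "embedding": "your-embedding-string",
        "shape": [1, 768],
        "dtype": "torch.bfloat16",
    },
    audio_config={"sample_rate": 44100, "encoding": "linear_pcm", "container": "wav"},
)
```

Validating locally only catches missing fields; the server remains the authority on which values are accepted.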

Response

The server streams audio data via Server-Sent Events (SSE). Each event contains a chunk of audio data encoded in base64.

Event Types

audio_chunk (event): Contains base64-encoded audio data.
event: audio_chunk
data: UklGRiQAAABXQVZFZm10IBAAAAABAAEAQB8AAEAfAAABAAgAZGF0YQAAAAA=

completed (event): Signals the end of the audio stream.
event: completed
data: {"status": "success"}
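
A client has to split the stream into events, base64-decode each audio_chunk, and stop at completed. The sketch below shows one way to do that over an iterable of already-decoded text lines. The function names are hypothetical, and appending the decoded chunks into one byte string is an assumption: this page does not say whether chunks are meant to be concatenated or played individually.

```python
import base64


def iter_sse_events(lines):
    """Yield (event_name, data) pairs from an iterable of decoded SSE text lines."""
    event_name, data_lines = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event.
            if event_name is not None or data_lines:
                yield event_name or "message", "\n".join(data_lines)
            event_name, data_lines = None, []
            continue
        if line.startswith(":"):
            continue  # SSE comment or keep-alive line
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # the SSE format strips one leading space
        if field == "event":
            event_name = value
        elif field == "data":
            data_lines.append(value)
    # Flush a final event that was not followed by a blank line.
    if event_name is not None or data_lines:
        yield event_name or "message", "\n".join(data_lines)


def collect_audio(lines):
    """Decode audio_chunk events and return the bytes received before 'completed'."""
    audio = bytearray()
    for event_name, data in iter_sse_events(lines):
        if event_name == "audio_chunk":
            audio.extend(base64.b64decode(data))  # assumption: chunks are appended in order
        elif event_name == "completed":
            break
    return bytes(audio)
```

The final flush is a pragmatic choice for streams whose last event ends without a trailing blank line; a strict SSE parser would discard such an event.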

Example Request

curl -X POST https://api.vachana.ai/api/v1/tts/sse \
  -H "X-API-Key-ID: your-api-key-id" \
  -H "Content-Type: application/json" \
  -H "Accept: text/event-stream" \
  -N \
  -d '{
    "text": "नमस्ते, आप कैसे हैं?",
    "model": "vachana-vc-v1",
    "audio_config": {
      "sample_rate": 44100,
      "encoding": "linear_pcm",
      "container": "wav"
    },
    "speaker_embedding": {
      "embedding": "your-embedding-string",
      "shape": [1, 768],
      "dtype": "torch.bfloat16"
    }
  }'
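
The same request can be issued from Python using only the standard library. This is a sketch under stated assumptions: `make_request` and `stream_lines` are hypothetical helper names, the URL and headers mirror the curl command above, and the `opener` parameter exists only so the network call can be swapped out in tests.

```python
import json
import urllib.request

SSE_URL = "https://api.vachana.ai/api/v1/tts/sse"


def make_request(payload, api_key_id):
    """Build the POST request with the same headers as the curl example."""
    # ensure_ascii=False keeps non-Latin text readable; the body is sent as UTF-8.
    data = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        SSE_URL,
        data=data,
        method="POST",
        headers={
            "X-API-Key-ID": api_key_id,
            "Content-Type": "application/json",
            "Accept": "text/event-stream",
        },
    )


def stream_lines(request, opener=urllib.request.urlopen):
    """Yield decoded text lines from the response as they arrive."""
    with opener(request) as response:
        for raw in response:
            yield raw.decode("utf-8")
```

Because `stream_lines` is a generator, nothing is sent until it is iterated, for example with `for line in stream_lines(make_request(payload, "your-api-key-id")): ...`. With the default opener, a non-2xx status raises `urllib.error.HTTPError` instead of yielding lines.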

Error Responses

400 Bad Request: Invalid text or audio configuration
{
  "success": false,
  "error": {
    "type": "INVALID_REQUEST_ERROR",
    "message": "Invalid text or audio configuration."
  }
}
429 Too Many Requests: Rate limit exceeded
{
  "success": false,
  "error": {
    "type": "RATE_LIMIT_ERROR",
    "message": "Rate limit exceeded. Please try again later."
  }
}
500 Internal Server Error: Unexpected error occurred
{
  "success": false,
  "error": {
    "type": "API_ERROR",
    "message": "An unexpected error occurred while processing."
  }
}
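
All three error bodies share the same shape (success, error.type, error.message), so a client can map them onto a single exception type. The sketch below is illustrative: `TTSError` and `parse_error` are hypothetical names, and treating RATE_LIMIT_ERROR and API_ERROR as retryable is an assumption rather than documented behavior.

```python
import json

# Assumption: rate-limit and server errors may succeed on retry; invalid requests will not.
RETRYABLE_TYPES = {"RATE_LIMIT_ERROR", "API_ERROR"}


class TTSError(Exception):
    """Carries the HTTP status plus the type and message from the error body."""

    def __init__(self, status, error_type, message):
        super().__init__(f"{status} {error_type}: {message}")
        self.status = status
        self.error_type = error_type
        self.retryable = error_type in RETRYABLE_TYPES


def parse_error(status, body_text):
    """Turn an error response body into a TTSError, tolerating non-JSON bodies."""
    try:
        parsed = json.loads(body_text)
    except ValueError:
        parsed = None
    err = parsed.get("error") if isinstance(parsed, dict) else None
    if not isinstance(err, dict):
        err = {}  # body was not JSON, or did not have the documented shape
    return TTSError(status, err.get("type", "UNKNOWN"), err.get("message", body_text))
```

A caller could raise the returned exception and, when `retryable` is true, wait before resending the request.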

Headers

X-API-Key-ID (string, required)

Body

application/json

Request body for TTS inference.

text (string, required)
model (enum<string>, required): Supported TTS models. Available options: vachana-voice-v3
audio_config (AudioConfig object, required): Audio output configuration.
voice (enum<string>): ID of a pre-defined voice. Ignored if speaker_embedding is provided. Available options: Karan, Simran, Nara, Riya, Viraj, Raju

Response

Successful Server-Sent Events stream

The response is of type string.