Voice Cloned TTS (REST)

TTS Inference

curl --request POST \
  --url https://api.vachana.ai/api/v1/tts/inference \
  --header 'Content-Type: application/json' \
  --header 'X-API-Key-ID: <x-api-key-id>' \
  --data '
{
  "audio_config": {
    "bitrate": "192k",
    "container": "mp3",
    "encoding": "linear_pcm",
    "num_channels": 1,
    "sample_rate": 44100,
    "sample_width": 2
  },
  "model": "vachana-voice-v3",
  "text": "नमस्ते, आप कैसे हैं?"
}
'

"<string>"

Overview

Step 1: Generate a voice embedding

Upload a reference audio clip to get a speaker_embedding. See Voice Clone Embeddings.

Step 2: Synthesize with your cloned voice (this page)

Pass the speaker_embedding from Step 1 to this endpoint, along with your text, to synthesize audio in your cloned voice. The full audio is returned in a single response. For streaming playback, see Voice Cloning Streaming or Voice Cloning Realtime.
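As a rough illustration of how the two steps connect, the sketch below assembles a request body from a Step 1 result. The helper name `build_tts_payload` and the `step1_result` dictionary are hypothetical; the sketch assumes the Step 1 result carries the `embedding`, `shape`, and `dtype` fields described under Request Body, and it takes the model name from that section.

```python
REQUIRED_EMBEDDING_KEYS = ("embedding", "shape", "dtype")


def build_tts_payload(text, speaker_embedding, model="vachana-vc-v1",
                      sample_rate=44100, container="wav"):
    """Assemble a JSON-serializable body for POST /api/v1/tts/inference.

    Hypothetical helper: the field names follow the Request Body section,
    but the function itself is not part of any official SDK.
    """
    missing = [k for k in REQUIRED_EMBEDDING_KEYS if k not in speaker_embedding]
    if missing:
        raise ValueError(f"speaker_embedding is missing keys: {missing}")
    return {
        "text": text,
        "model": model,
        "audio_config": {
            "sample_rate": sample_rate,
            "encoding": "linear_pcm",
            "container": container,
        },
        # Copy only the three documented fields from the Step 1 result.
        "speaker_embedding": {k: speaker_embedding[k] for k in REQUIRED_EMBEDDING_KEYS},
    }


# Placeholder values; a real embedding comes from the Voice Clone Embeddings endpoint.
step1_result = {
    "embedding": "your-embedding-string",
    "shape": [1, 768],
    "dtype": "torch.bfloat16",
}
payload = build_tts_payload("नमस्ते, आप कैसे हैं?", step1_result)
print(sorted(payload))
```

Checking for the three embedding fields before sending is a convenience only; the server remains the authority on what it accepts.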

Endpoint

POST https://api.vachana.ai/api/v1/tts/inference

Authentication

Header	Required	Description	Example
`X-API-Key-ID`	Yes	Your API key for authentication	`your-api-key-id`
`Content-Type`	Yes	Must be `application/json`	`application/json`

Request Body

Field	Type	Required	Description
`text`	string	Yes	The text to synthesize into speech
`model`	string	Yes	Voice cloning model to use. Currently supported: `vachana-vc-v1`
`audio_config`	object	Yes	Audio output configuration
`speaker_embedding`	object	Yes	Voice clone embedding obtained from the Voice Clone Embeddings endpoint

Properties of `audio_config`:

Field	Type	Required	Description
`sample_rate`	integer	No	Sample rate in Hz (8000-44100)
`num_channels`	integer	No	Number of audio channels (1-8)
`sample_width`	integer	No	Sample width in bytes (1-4)
`encoding`	string	No	Audio encoding format: `linear_pcm` or `oggopus`
`container`	string	No	Audio container format: `raw`, `mp3`, `wav`, `mulaw`, or `ogg`
`bitrate`	string	No	MP3 bitrate (only when `container` is `mp3`): `96k`, `128k`, or `192k`

Properties of `speaker_embedding`:

Field	Type	Required	Description
`embedding`	string	Yes	The voice clone embedding string
`shape`	array	Yes	Shape of the embedding tensor, e.g., `[1, 768]`
`dtype`	string	Yes	Data type of the embedding, e.g., `torch.bfloat16`
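The documented audio_config limits can be checked on the client before a request is sent. The following is a minimal sketch, not an official validator: the function name `validate_audio_config` is hypothetical, the numeric ranges are assumed to be inclusive, and treating a `bitrate` on a non-MP3 container as a problem is an assumption (the server may simply ignore it).

```python
ALLOWED_ENCODINGS = {"linear_pcm", "oggopus"}
ALLOWED_CONTAINERS = {"raw", "mp3", "wav", "mulaw", "ogg"}
ALLOWED_BITRATES = {"96k", "128k", "192k"}


def validate_audio_config(cfg):
    """Return a list of problems found in an audio_config dict (empty if none)."""
    problems = []

    def check_range(key, low, high):
        value = cfg.get(key)
        if value is None:
            return  # property not set, nothing to check
        is_int = isinstance(value, int) and not isinstance(value, bool)
        if not is_int or not low <= value <= high:
            problems.append(f"{key} must be an integer in [{low}, {high}], got {value!r}")

    # Ranges as listed in the property descriptions (assumed inclusive).
    check_range("sample_rate", 8000, 44100)
    check_range("num_channels", 1, 8)
    check_range("sample_width", 1, 4)

    if "encoding" in cfg and cfg["encoding"] not in ALLOWED_ENCODINGS:
        problems.append(f"unsupported encoding: {cfg['encoding']!r}")
    if "container" in cfg and cfg["container"] not in ALLOWED_CONTAINERS:
        problems.append(f"unsupported container: {cfg['container']!r}")

    if "bitrate" in cfg:
        if cfg.get("container") != "mp3":
            # Assumption: flagged here because bitrate is documented for mp3 only.
            problems.append("bitrate applies only when container is 'mp3'")
        elif cfg["bitrate"] not in ALLOWED_BITRATES:
            problems.append(f"unsupported bitrate: {cfg['bitrate']!r}")

    return problems


print(validate_audio_config({"sample_rate": 48000, "container": "wav", "bitrate": "192k"}))
```

Returning a list of problems rather than raising on the first one lets a caller report every issue at once.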

Response

Returns binary audio data in the format specified by audio_config.container:

audio/wav for WAV files
audio/mpeg for MP3 files
audio/ogg for OGG files
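A client can use the response Content-Type to pick a file extension when saving the audio. The sketch below covers only the three media types listed above; the helper names and the `.bin` fallback for any other media type (for example, whatever the server sends for `raw` or `mulaw` output, which this page does not specify) are assumptions.

```python
CONTENT_TYPE_TO_EXTENSION = {
    "audio/wav": ".wav",
    "audio/mpeg": ".mp3",
    "audio/ogg": ".ogg",
}


def extension_for(content_type, default=".bin"):
    """Map a Content-Type header value to a file extension."""
    # Drop any parameters after ';' and normalize case before the lookup.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return CONTENT_TYPE_TO_EXTENSION.get(media_type, default)


def save_audio(body, content_type, stem="audio"):
    """Write the binary response body to '<stem><extension>' and return the path."""
    path = stem + extension_for(content_type)
    with open(path, "wb") as f:
        f.write(body)
    return path


print(extension_for("audio/mpeg"))
```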

Example Request

curl -X POST https://api.vachana.ai/api/v1/tts/inference \
  -H "X-API-Key-ID: your-api-key-id" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "नमस्ते, आप कैसे हैं?",
    "model": "vachana-vc-v1",
    "audio_config": {
      "sample_rate": 44100,
      "encoding": "linear_pcm",
      "container": "wav"
    },
    "speaker_embedding": {
      "embedding": "your-embedding-string",
      "shape": [1, 768],
      "dtype": "torch.bfloat16"
    }
  }' \
  --output audio.wav
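For readers who prefer a script to curl, a roughly equivalent request can be sketched with Python's standard library. The URL, headers, and body mirror the example above; the helper names `make_request` and `synthesize_to_file` and the 60-second timeout are illustrative assumptions, and the API key and embedding values are placeholders.

```python
import json
import urllib.request

API_URL = "https://api.vachana.ai/api/v1/tts/inference"


def make_request(api_key_id, payload):
    """Build (but do not send) the POST request for the inference endpoint."""
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "X-API-Key-ID": api_key_id,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize_to_file(api_key_id, payload, out_path="audio.wav", timeout=60):
    """Send the request and write the binary audio response to out_path."""
    req = make_request(api_key_id, payload)
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with urllib.request.urlopen(req, timeout=timeout) as resp, open(out_path, "wb") as f:
        f.write(resp.read())
    return out_path


payload = {
    "text": "नमस्ते, आप कैसे हैं?",
    "model": "vachana-vc-v1",
    "audio_config": {"sample_rate": 44100, "encoding": "linear_pcm", "container": "wav"},
    "speaker_embedding": {
        "embedding": "your-embedding-string",  # placeholder
        "shape": [1, 768],
        "dtype": "torch.bfloat16",
    },
}

request = make_request("your-api-key-id", payload)
print(request.get_method(), request.full_url)
# To actually call the API: synthesize_to_file("your-api-key-id", payload)
```

Building the request separately from sending it makes the headers and body easy to inspect without network access.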

Error Responses

400 Bad Request: Invalid text or audio configuration

{
  "success": false,
  "error": {
    "type": "INVALID_REQUEST_ERROR",
    "message": "Invalid text or audio configuration."
  }
}

429 Too Many Requests: Rate limit exceeded

{
  "success": false,
  "error": {
    "type": "RATE_LIMIT_ERROR",
    "message": "Rate limit exceeded. Please try again later."
  }
}

500 Internal Server Error: Unexpected error occurred

{
  "success": false,
  "error": {
    "type": "API_ERROR",
    "message": "An unexpected error occurred while processing."
  }
}
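The error bodies above share one shape, so a client can parse them uniformly. The sketch below is one possible approach: the helper names are hypothetical, retrying only on 429 is a judgment call based on its "try again later" message, and the exponential backoff schedule is an assumption because this page does not document rate-limit windows or a Retry-After header.

```python
import json

RETRYABLE_STATUSES = {429}  # assumption: only rate-limit errors are retried


def parse_error(body):
    """Extract (type, message) from an error response body (str or bytes)."""
    try:
        doc = json.loads(body)
    except ValueError:
        doc = None  # body was not valid JSON
    error = doc.get("error") if isinstance(doc, dict) else None
    if not isinstance(error, dict):
        return "UNKNOWN", ""
    return error.get("type", "UNKNOWN"), error.get("message", "")


def should_retry(status):
    """Decide whether a failed request is worth retrying."""
    return status in RETRYABLE_STATUSES


def backoff_delays(attempts, base=1.0, cap=30.0):
    """Exponential backoff delays in seconds: base, 2*base, 4*base, ... up to cap."""
    return [min(cap, base * 2 ** i) for i in range(attempts)]


sample = '{"success": false, "error": {"type": "RATE_LIMIT_ERROR", "message": "Rate limit exceeded. Please try again later."}}'
error_type, message = parse_error(sample)
print(error_type, should_retry(429), backoff_delays(4))
```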

Headers

Header	Type	Required
`X-API-Key-ID`	string	Yes

Body

application/json

Request body for TTS inference.

Field	Type	Required	Description
`text`	string	Yes	The text to synthesize into speech
`model`	enum<string>	Yes	Supported TTS models. Available options: `vachana-voice-v3`
`audio_config`	AudioConfig (object)	Yes	Audio output configuration.
`voice`	enum<string>	No	ID of a pre-defined voice. Ignored if `speaker_embedding` is provided. Available options: `Karan`, `Simran`, `Nara`, `Riya`, `Viraj`, `Raju`

Response

Successful audio synthesis

The response is of type file.
