Text-to-Speech (REST)

TTS Inference

curl --request POST \
  --url https://api.vachana.ai/api/v1/tts/inference \
  --header 'Content-Type: application/json' \
  --header 'X-API-Key-ID: <x-api-key-id>' \
  --data '
{
  "text": "नमस्ते, आप कैसे हैं?",
  "model": "timbre-v2.5",
  "audio_config": {
    "encoding": "linear_pcm",
    "container": "wav",
    "num_channels": 1,
    "sample_rate": 48000,
    "sample_width": 2
  }
}
' \
  --output response.wav

import requests

url = "https://api.vachana.ai/api/v1/tts/inference"

payload = {
    "text": "नमस्ते, आप कैसे हैं?",
    "model": "timbre-v2.5",
    "audio_config": {
        "encoding": "linear_pcm",
        "container": "wav",
        "num_channels": 1,
        "sample_rate": 48000,
        "sample_width": 2
    }
}
headers = {
    "X-API-Key-ID": "<x-api-key-id>",
    "Content-Type": "application/json"
}

response = requests.post(url, json=payload, headers=headers)

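# A successful response is binary audio, so write the raw bytes to a file
# instead of printing them as text.
response.raise_for_status()
with open("response.wav", "wb") as f:
    f.write(response.content)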

const options = {
  method: 'POST',
  headers: {'X-API-Key-ID': '<x-api-key-id>', 'Content-Type': 'application/json'},
  body: JSON.stringify({
    text: 'नमस्ते, आप कैसे हैं?',
    model: 'timbre-v2.5',
    audio_config: {
      encoding: 'linear_pcm',
      container: 'wav',
      num_channels: 1,
      sample_rate: 48000,
      sample_width: 2
    }
  })
};

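// A successful response is binary audio, so read it as an ArrayBuffer, not JSON.
fetch('https://api.vachana.ai/api/v1/tts/inference', options)
  .then(res => res.arrayBuffer())
  .then(audio => console.log('Received audio:', audio.byteLength, 'bytes'))
  .catch(err => console.error(err));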

"<string>"

{
  "success": false,
  "error": {
    "type": "INVALID_REQUEST_ERROR",
    "message": "Invalid text or audio configuration."
  }
}

{
  "success": false,
  "error": {
    "type": "RATE_LIMIT_ERROR",
    "message": "Rate limit exceeded. Please try again later."
  }
}

{
  "success": false,
  "error": {
    "type": "API_ERROR",
    "message": "An unexpected error occurred while processing."
  }
}

{
  "success": false,
  "error": {
    "type": "SERVICE_UNAVAILABLE",
    "message": "Text-to-speech service is temporarily unavailable."
  }
}
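
The RATE_LIMIT_ERROR and SERVICE_UNAVAILABLE bodies above look like transient failures. The following is a minimal sketch of retrying them with exponential backoff. It assumes that retrying is appropriate for those two error types, since the page states no retry policy. The helper name `call_with_retry`, the attempt count, and the delays are illustrative and not part of the API.

```python
import time

# Error types from the sample responses that look transient.
# Treating them as retryable is an assumption, not documented behavior.
RETRYABLE_TYPES = {"RATE_LIMIT_ERROR", "SERVICE_UNAVAILABLE"}


def call_with_retry(send, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns audio bytes or a non-retryable error.

    send() must return either bytes (audio) or a dict shaped like the
    error bodies above: {"success": False, "error": {"type": ..., "message": ...}}.
    """
    for attempt in range(max_attempts):
        result = send()
        if isinstance(result, (bytes, bytearray)):
            return result
        error = result.get("error", {})
        error_type = error.get("type")
        is_last_attempt = attempt == max_attempts - 1
        if error_type not in RETRYABLE_TYPES or is_last_attempt:
            raise RuntimeError(f"{error_type}: {error.get('message', 'unknown error')}")
        # Exponential backoff: base_delay, then 2x, 4x, ...
        sleep(base_delay * (2 ** attempt))
```

Passing `sleep` as a parameter keeps the helper easy to exercise without real delays.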

Currently in beta. You’re on the priority waitlist and among the first to get access.

Overview

Get the complete synthesized audio in one response. Best for downloads or batch processing. For streaming playback, see TTS Streaming or TTS Realtime.

Passing numbers, IDs, dates, or currency as raw strings causes mispronunciations. See the Input Formatting Guide for correct formatting of phone numbers, account numbers, PINs, Aadhaar, vehicle registration numbers, GSTIN, currency, and more.
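
As an illustration of passing digits as spoken words, here is a small sketch that spells out a digit string before it is placed in `text`. Reading the value digit by digit in English is an assumption made for this example. The Input Formatting Guide remains the reference for the recommended form of each kind of value, and `spell_digits` is a hypothetical helper.

```python
# English digit names. Reading IDs digit by digit is an assumption here;
# the Input Formatting Guide defines the recommended formats.
DIGIT_WORDS = ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"]


def spell_digits(value: str) -> str:
    """Replace each ASCII digit with its spoken word and drop separators."""
    words = [DIGIT_WORDS[int(ch)] for ch in value if ch in "0123456789"]
    return " ".join(words)


text = f"Your PIN is {spell_digits('4071')}."
print(text)  # Your PIN is four zero seven one.
```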

Available Voices

For the list of available voices, see the Available Voices page.

Endpoint

POST https://api.vachana.ai/api/v1/tts/inference

Authentication

All requests require the following headers:

Header	Required	Description	Example
`Content-Type`	Yes	Must be `application/json`	`application/json`
`X-API-Key-ID`	Yes	Your API key for authentication	`<your-api-key-id>`
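
A minimal sketch of building these two headers in Python. Reading the key from the `GNANI_API_KEY` environment variable is an assumption borrowed from the SDK section of this page; the REST API itself only requires that the key arrive in `X-API-Key-ID`. The helper name `build_headers` is hypothetical.

```python
import os


def build_headers(api_key=None):
    """Return the headers required by the endpoint.

    Falls back to the GNANI_API_KEY environment variable (an assumption
    for this sketch) when no key is passed explicitly.
    """
    key = api_key or os.environ.get("GNANI_API_KEY")
    if not key:
        raise ValueError("No API key provided")
    return {
        "Content-Type": "application/json",
        "X-API-Key-ID": key,
    }
```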

Request Parameters

Send a JSON body with the following structure:

{
  "text": "नमस्ते, आप कैसे हैं?",
  "voice": "Pranav",
  "model": "timbre-v2.5",
  "audio_config": {
    "sample_rate": 48000,
    "num_channels": 1,
    "sample_width": 2,
    "encoding": "linear_pcm",
    "container": "wav"
  }
}

Parameter	Type	Required / Default	Description
`text`	string	required	The text to synthesize into speech. Pass numbers, dates, and currency as spoken words to avoid mispronunciations; see the Input Formatting Guide.
`model`	string	required	TTS model to use. Currently supported: `timbre-v2.0`, `timbre-v2.5`.
`voice`	string	optional	ID of a predefined voice. See Available Voices.
`sample_rate`	integer	default: 48000	Sample rate in Hz. Supported: 8000, 16000, 22050, 24000, 44100, 48000.
`num_channels`	integer	required	Number of audio channels (1 for mono, 2 for stereo).
`sample_width`	integer	required	Sample width in bytes (for example, 2 for 16-bit audio).
`encoding`	string	default: `linear_pcm`	Audio encoding. Options: `linear_pcm`, `pcm_s16le`, `pcm_mulaw`, `pcm_alaw`, `oggopus`. For telephony, prefer `container=mulaw` or `container=alaw` over the `pcm_mulaw` and `pcm_alaw` encodings; both routes produce the same output. Use `oggopus` (or `container=ogg`) for a playable OGG Opus file.
`container`	string	default: `wav`	Output container. Options: `wav`, `raw`, `mp3`, `ogg`, `mulaw`, `alaw`. Use `ogg` for OGG Opus. Use `mulaw` or `alaw` for G.711 telephony (forces 8000 Hz regardless of `sample_rate`).
`bitrate`	string	default: `128k`	MP3 bitrate. Only used when `container` is `mp3`. Supported: `32k`, `64k`, `96k`, `128k`, `192k`.

The fields `sample_rate` through `container` are set inside `audio_config`, as in the JSON above.
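
The supported values listed above can be checked on the client before a request is sent. The sketch below is one way to do that. The function name `check_audio_config` is hypothetical, and placing `bitrate` inside the same dictionary as the other audio fields is an assumption, because the page does not show where `bitrate` goes in the request body.

```python
SUPPORTED_SAMPLE_RATES = {8000, 16000, 22050, 24000, 44100, 48000}
SUPPORTED_ENCODINGS = {"linear_pcm", "pcm_s16le", "pcm_mulaw", "pcm_alaw", "oggopus"}
SUPPORTED_CONTAINERS = {"wav", "raw", "mp3", "ogg", "mulaw", "alaw"}
SUPPORTED_BITRATES = {"32k", "64k", "96k", "128k", "192k"}


def check_audio_config(config: dict) -> list[str]:
    """Return a list of problems; an empty list means the values look valid."""
    problems = []

    # Documented defaults are applied when a field is missing.
    sample_rate = config.get("sample_rate", 48000)
    if sample_rate not in SUPPORTED_SAMPLE_RATES:
        problems.append(f"unsupported sample_rate: {sample_rate}")

    encoding = config.get("encoding", "linear_pcm")
    if encoding not in SUPPORTED_ENCODINGS:
        problems.append(f"unsupported encoding: {encoding}")

    container = config.get("container", "wav")
    if container not in SUPPORTED_CONTAINERS:
        problems.append(f"unsupported container: {container}")

    # These two fields are marked required and have no default.
    for field in ("num_channels", "sample_width"):
        if field not in config:
            problems.append(f"{field} is required")

    # Assumption: bitrate lives alongside the other audio fields.
    if "bitrate" in config:
        if config["bitrate"] not in SUPPORTED_BITRATES:
            problems.append(f"unsupported bitrate: {config['bitrate']}")
        elif container != "mp3":
            problems.append("bitrate only applies when container is mp3")

    return problems
```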

Audio Format Reference

`container`	`encoding`	Output	`sample_rate`	`bitrate`	Content-Type
`wav`	`linear_pcm`	WAV file (with header)	8000–48000 Hz	—	`audio/wav`
`raw`	`linear_pcm`	Raw 16-bit PCM	8000–48000 Hz	—	`application/octet-stream`
`mp3`	—	MP3 file	8000–48000 Hz	`32k`–`192k`	`audio/mpeg`
`ogg`	—	OGG Opus file	8000–48000 Hz	—	`audio/ogg`
`mulaw`	—	Raw G.711 µ-law	forced 8000 Hz	—	`audio/basic`
`alaw`	—	Raw G.711 A-law	forced 8000 Hz	—	`audio/alaw`
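
The table above can be carried into client code as a lookup, for example to pick a file name or to predict the sample rate of the returned audio. This is a sketch: the content types and the forced 8000 Hz rows come from the table, while the file extensions and the names `OutputFormat`, `FORMATS`, and `effective_sample_rate` are illustrative choices.

```python
from typing import NamedTuple


class OutputFormat(NamedTuple):
    content_type: str
    extension: str      # suggested file extension, not defined by the API
    forces_8khz: bool   # True when the container overrides sample_rate


FORMATS = {
    "wav": OutputFormat("audio/wav", ".wav", False),
    "raw": OutputFormat("application/octet-stream", ".pcm", False),
    "mp3": OutputFormat("audio/mpeg", ".mp3", False),
    "ogg": OutputFormat("audio/ogg", ".ogg", False),
    "mulaw": OutputFormat("audio/basic", ".ulaw", True),
    "alaw": OutputFormat("audio/alaw", ".alaw", True),
}


def effective_sample_rate(container: str, requested: int) -> int:
    """Return the sample rate the output will actually use."""
    return 8000 if FORMATS[container].forces_8khz else requested
```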

Encoding aliases — produce identical output to the container rows above:

`container`	`encoding`	Equivalent to
`raw`	`pcm_mulaw`	`container=mulaw`
`raw`	`pcm_alaw`	`container=alaw`
`raw` or `ogg`	`oggopus`	`container=ogg`

bitrate only applies when container=mp3. container=mulaw/alaw override sample_rate to 8000 Hz.
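
Because the alias combinations produce the same output as a plain container setting, a client can normalize them before building the request. The sketch below maps only the pairs listed in the alias table and leaves every other combination untouched. `canonical_container` is a hypothetical helper name.

```python
# (container, encoding) pairs from the alias table and the container
# setting each one is equivalent to.
ALIASES = {
    ("raw", "pcm_mulaw"): "mulaw",
    ("raw", "pcm_alaw"): "alaw",
    ("raw", "oggopus"): "ogg",
    ("ogg", "oggopus"): "ogg",
}


def canonical_container(container: str, encoding: str) -> str:
    """Return the equivalent container, or the input container if no alias applies."""
    return ALIASES.get((container, encoding), container)
```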

Response

A successful request returns 200 OK with raw binary audio in the format specified by audio_config.container (for example audio/wav, audio/mpeg, or audio/ogg).
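
A sketch of separating the binary success case from the JSON error bodies shown on this page. It assumes that every non-200 response carries the `{"success": false, "error": {...}}` shape; the page does not say which HTTP status code accompanies each error type, so none is hard-coded. `extract_audio` is a hypothetical helper.

```python
import json


def extract_audio(status_code: int, body: bytes) -> bytes:
    """Return the audio bytes on 200 OK, otherwise raise with the API's error details."""
    if status_code == 200:
        return body

    # Assumption: error responses are JSON with an "error" object.
    try:
        payload = json.loads(body.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        payload = {}
    error = payload.get("error", {}) if isinstance(payload, dict) else {}

    error_type = error.get("type", "UNKNOWN_ERROR")
    message = error.get("message", "no message provided")
    raise RuntimeError(f"HTTP {status_code} {error_type}: {message}")
```

With the `requests` library this would be called as `extract_audio(response.status_code, response.content)`.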

Code Example

curl -X POST https://api.vachana.ai/api/v1/tts/inference \
  -H "Content-Type: application/json" \
  -H "X-API-Key-ID: <your-api-key>" \
  -d '{
    "text": "नमस्ते, आप कैसे हैं?",
    "voice": "Pranav",
    "model": "timbre-v2.5",
    "audio_config": {
      "sample_rate": 48000,
      "num_channels": 1,
      "sample_width": 2,
      "encoding": "linear_pcm",
      "container": "wav"
    }
  }' \
  --output response.wav

const response = await fetch("https://api.vachana.ai/api/v1/tts/inference", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "X-API-Key-ID": "<your-api-key>",
  },
  body: JSON.stringify({
    text: "नमस्ते, आप कैसे हैं?",
    voice: "Pranav",
    model: "timbre-v2.5",
    audio_config: {
      sample_rate: 48000,
      num_channels: 1,
      sample_width: 2,
      encoding: "linear_pcm",
      container: "wav",
    },
  }),
});

const audio = await response.arrayBuffer();
// Handle binary audio data
console.log("Received audio:", audio.byteLength, "bytes");

import requests

response = requests.post(
    "https://api.vachana.ai/api/v1/tts/inference",
    headers={
        "Content-Type": "application/json",
        "X-API-Key-ID": "<your-api-key>",
    },
    json={
        "text": "नमस्ते, आप कैसे हैं?",
        "voice": "Pranav",
        "model": "timbre-v2.5",
        "audio_config": {
            "sample_rate": 48000,
            "num_channels": 1,
            "sample_width": 2,
            "encoding": "linear_pcm",
            "container": "wav",
        },
    },
)

with open("response.wav", "wb") as f:
    f.write(response.content)

Python SDK

The official Python SDK lets you synthesize speech in one line, without constructing JSON payloads or handling binary audio responses manually.

Installation

pip install gnani-vachana

Requires Python 3.10+.

Authentication

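Pass the API key directly when constructing the client:

from gnani.tts import GnaniTTSClient

client = GnaniTTSClient(api_key="your-api-key")

Alternatively, set the GNANI_API_KEY environment variable and construct the client without arguments:

export GNANI_API_KEY="your-api-key"

from gnani.tts import GnaniTTSClient

client = GnaniTTSClient()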

Synthesize Speech

The synthesize method returns the complete audio as bytes, which you can write to a file or pass directly to an audio player.

from gnani.tts import GnaniTTSClient

client = GnaniTTSClient(api_key="your-api-key")

audio = client.synthesize(
    "नमस्ते, आप कैसे हैं?",
    voice="Kaveri",
)

with open("output.wav", "wb") as f:
    f.write(audio)
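
Since the result is plain bytes, it can be inspected without touching the disk. The sketch below uses Python's standard `wave` module to read the header of the returned audio. It assumes the output is a WAV file with a standard header, which matches the default `container` of `wav` in the REST parameters; whether the SDK uses the same default is not stated here. `describe_wav` is a hypothetical helper.

```python
import io
import wave


def describe_wav(audio: bytes) -> dict:
    """Read channel count, sample width, sample rate, and duration from WAV bytes."""
    with wave.open(io.BytesIO(audio), "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_width": wav.getsampwidth(),  # bytes per sample
            "sample_rate": rate,
            "duration_s": frames / rate,
        }
```

Calling `describe_wav(audio)` on the bytes returned by `synthesize` would then report the format actually produced.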

Custom Audio Config

Control the sample rate, encoding, and container format of the output audio.

from gnani.tts import GnaniTTSClient, AudioConfig

client = GnaniTTSClient(api_key="your-api-key")

audio = client.synthesize(
    "यह एक टेस्ट है",
    voice="Pranav",
    audio_config=AudioConfig(
        sample_rate=48000,
        encoding="linear_pcm",
        container="wav",
    ),
)

with open("output.wav", "wb") as f:
    f.write(audio)

List Available Voices

from gnani.tts import GnaniTTSClient

voices = GnaniTTSClient.supported_voices()
print(voices)

Supported Languages

The Gnani Timbre v2.0 API supports 2 languages.

Language	Native Script	Example
English	Latin	“I am going to the market”
Hindi	Devanagari (हिन्दी)	“मैं बाज़ार जा रहा हूँ”

Headers

Header	Type	Required
`X-API-Key-ID`	string	Yes

Body

application/json

Request body for TTS inference.

Field	Type	Required	Description
`text`	string	Yes	The text to synthesize into speech.
`model`	enum<string>	Yes	Supported TTS models. Available options: `timbre-v2.5`, `timbre-v2.0`.
`audio_config`	AudioConfig object	Yes	Audio output configuration.
`voice`	string	No	Voice name from the Timbre catalog.
`language`	string	No	Language code. Use `IND-IN` for Timbre v2.0. For Timbre v2.5, use `auto`, `hi-IN`, `en-IN`, `ta-IN`, `te-IN`, `kn-IN`, `ml-IN`, `mr-IN`, `pa-IN`, `bn-IN`, or `gu-IN`.
`speed`	number	No	Playback speed multiplier (Timbre v2.5 only). Range: 0.85–1.15.
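
The model-specific rules for `language` and `speed` can be enforced when the request body is assembled. The following is a sketch under two assumptions: the 0.85 to 1.15 range is treated as inclusive, and sending `speed` with `timbre-v2.0` is treated as an error on the client side. The function name `build_payload` is hypothetical.

```python
V25_LANGUAGES = {"auto", "hi-IN", "en-IN", "ta-IN", "te-IN", "kn-IN",
                 "ml-IN", "mr-IN", "pa-IN", "bn-IN", "gu-IN"}
V20_LANGUAGES = {"IND-IN"}


def build_payload(text, model, audio_config, voice=None, language=None, speed=None):
    """Assemble the JSON body, validating language and speed against the model."""
    payload = {"text": text, "model": model, "audio_config": audio_config}

    if voice is not None:
        payload["voice"] = voice

    if language is not None:
        allowed = V25_LANGUAGES if model == "timbre-v2.5" else V20_LANGUAGES
        if language not in allowed:
            raise ValueError(f"language {language!r} is not listed for {model}")
        payload["language"] = language

    if speed is not None:
        if model != "timbre-v2.5":
            raise ValueError("speed is only supported by timbre-v2.5")
        # Assumption: both ends of the documented range are allowed.
        if not 0.85 <= speed <= 1.15:
            raise ValueError("speed must be between 0.85 and 1.15")
        payload["speed"] = speed

    return payload
```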

Response

Successful audio synthesis

The response is of type file.
