Overview
Synthesises natural-sounding speech from text. Designed for IVR, conversational agents, and short narration. The small model footprint (82M parameters) keeps CPU latency low for real-time use.
Endpoint and model
Send a `POST` request to `https://api.tesseraai.cloud/v1/audio/speech` with `model` set to `kokoro-82m` and `voice` set to an ID from the catalogue below.
| Attribute | Value |
|---|---|
| Upstream model | hexgrad/Kokoro-82M |
| P50 latency | <200 ms for phrases under 30 chars |
| Output formats | mp3 (default), wav, opus, aac, flac, pcm |
| Sample rate | 24 kHz mono |
| Licence | Apache 2.0 |
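The `pcm` output format has no container, so most players need the sample rate and channel count supplied separately. The sketch below wraps such a payload in a WAV header using the 24 kHz mono figures from the table. It assumes the samples are 16-bit signed little-endian, which this page does not state; check that against real output before relying on it. The helper names `wrap_pcm_as_wav` and `pcm_duration_seconds` are illustrative only.

```python
import io
import wave

SAMPLE_RATE_HZ = 24_000   # from the attribute table
CHANNELS = 1              # mono, from the attribute table
SAMPLE_WIDTH_BYTES = 2    # ASSUMPTION: 16-bit samples (not stated in this page)


def wrap_pcm_as_wav(pcm: bytes) -> bytes:
    """Return the raw PCM payload wrapped in a WAV container."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav_file:
        wav_file.setnchannels(CHANNELS)
        wav_file.setsampwidth(SAMPLE_WIDTH_BYTES)
        wav_file.setframerate(SAMPLE_RATE_HZ)
        wav_file.writeframes(pcm)
    return buf.getvalue()


def pcm_duration_seconds(pcm: bytes) -> float:
    """Duration implied by the byte count under the assumptions above."""
    bytes_per_second = SAMPLE_RATE_HZ * CHANNELS * SAMPLE_WIDTH_BYTES
    return len(pcm) / bytes_per_second
```

If playback sounds like noise or runs at the wrong speed, the sample-width assumption is the first thing to revisit.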
Available voices
Each voice ID encodes language and gender in its two-letter prefix. The first letter is the language: `e` Spanish, `a` American English, `b` British English, `p` Brazilian Portuguese, `f` French, `i` Italian, `j` Japanese, `z` Mandarin, `h` Hindi. The second letter is the gender: `f` (female) or `m` (male).
| Language | Voices |
|---|---|
| Spanish (neutral, ES + LATAM) | `ef_dora`, `em_alex`, `em_santa` |
| American English | `af_heart`, `af_sky`, `af_bella`, `af_nicole`, `af_sarah`, `af_aoede`, `af_kore`, `af_jessica`, `af_nova`, `af_river`, `af_jadzia`, `am_michael`, `am_adam`, `am_eric`, `am_fenrir`, `am_liam`, `am_onyx`, `am_puck`, `am_santa` |
| British English | `bf_alice`, `bf_emma`, `bf_isabella`, `bf_lily`, `bm_daniel`, `bm_fable`, `bm_george`, `bm_lewis` |
| Brazilian Portuguese | `pf_dora`, `pm_alex`, `pm_santa` |
| French | `ff_siwis` |
| Italian | `if_sara`, `im_nicola` |
| Japanese | `jf_alpha`, `jf_gongitsune`, `jf_nezumi`, `jf_tebukuro`, `jm_kumo` |
| Mandarin | `zf_xiaobei`, `zf_xiaoni`, `zf_xiaoxiao`, `zf_xiaoyi`, `zm_yunjian`, `zm_yunxi`, `zm_yunxia`, `zm_yunyang` |
| Hindi | `hf_alpha`, `hf_beta`, `hm_omega`, `hm_psi` |
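The prefix convention can be decoded mechanically, which is handy for grouping or filtering voices client-side. The following is a minimal sketch based only on the prefix letters listed on this page; `describe_voice` is a hypothetical helper, not part of any Tessera SDK, and it does not check that the voice name itself exists in the catalogue.

```python
# Prefix letters as listed in the voice catalogue on this page.
LANGUAGES = {
    "e": "Spanish",
    "a": "American English",
    "b": "British English",
    "p": "Brazilian Portuguese",
    "f": "French",
    "i": "Italian",
    "j": "Japanese",
    "z": "Mandarin",
    "h": "Hindi",
}
GENDERS = {"f": "female", "m": "male"}


def describe_voice(voice_id: str) -> tuple[str, str, str]:
    """Split a voice ID such as 'bf_emma' into (language, gender, name)."""
    prefix, sep, name = voice_id.partition("_")
    if not sep or len(prefix) != 2 or not name:
        raise ValueError(f"unexpected voice ID format: {voice_id!r}")
    lang_code, gender_code = prefix
    if lang_code not in LANGUAGES or gender_code not in GENDERS:
        raise ValueError(f"unknown prefix in voice ID: {voice_id!r}")
    return LANGUAGES[lang_code], GENDERS[gender_code], name
```

For example, `describe_voice("bf_emma")` yields British English, female, `emma`, while an ID with an unlisted prefix raises `ValueError`.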
Request

    curl https://api.tesseraai.cloud/v1/audio/speech \
      -H "Authorization: Bearer $TESSERA_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "kokoro-82m",
        "input": "Welcome to Tessera. Your request has been received.",
        "voice": "af_heart",
        "response_format": "mp3"
      }' \
      --output welcome.mp3

Response
The response body is the raw audio in the requested format, with no JSON wrapper.
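The same call can be made from Python with the standard library alone. This is a sketch under the assumptions visible in the curl example (bearer-token auth, a JSON body with `model`, `input`, `voice`, and `response_format`); the function names are illustrative, and error handling is left to the caller.

```python
import json
import urllib.request

API_URL = "https://api.tesseraai.cloud/v1/audio/speech"


def build_speech_request(text, voice, api_key, response_format="mp3"):
    """Build the POST request without sending it."""
    payload = {
        "model": "kokoro-82m",
        "input": text,
        "voice": voice,
        "response_format": response_format,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesise_to_file(text, voice, api_key, path, response_format="mp3"):
    """Send the request and write the raw audio bytes to `path`."""
    request = build_speech_request(text, voice, api_key, response_format)
    # urlopen raises urllib.error.HTTPError on non-2xx responses.
    with urllib.request.urlopen(request, timeout=30) as response:
        audio = response.read()
    # The body is the audio itself, so it is written out unparsed.
    with open(path, "wb") as out:
        out.write(audio)
```

A call such as `synthesise_to_file("Welcome to Tessera.", "af_heart", os.environ["TESSERA_API_KEY"], "welcome.mp3")` would mirror the curl example, given `import os` and the key set in the environment.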