Sinhala text-to-speech. OpenAI-compatible POST /v1/audio/speech. The first call after an idle period triggers a cold start (~40–60 s).
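A minimal sketch of what a call to the endpoint might look like from a script. Only the path POST /v1/audio/speech comes from this page. The base URL, model name, and voice value are placeholders, and the `model` / `input` / `voice` fields are the usual OpenAI speech request shape that this server is assumed to accept. The timeout is set generously so a cold start does not abort the first request.

```python
import json
import urllib.request

# Hypothetical values, not taken from this page: adjust to your deployment.
BASE_URL = "http://localhost:8000"
MODEL = "sinhala-tts"


def build_speech_request(text, base_url=BASE_URL, model=MODEL, voice="default"):
    """Build (but do not send) a POST request for /v1/audio/speech."""
    # Field names follow the OpenAI speech API; the server is assumed to accept them.
    body = json.dumps({"model": model, "input": text, "voice": voice}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/audio/speech",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text, timeout=120.0):
    """Send the request and return the raw audio bytes.

    The timeout is well above the ~40-60 s cold start noted above.
    """
    with urllib.request.urlopen(build_speech_request(text), timeout=timeout) as resp:
        return resp.read()
```

Calling `synthesize("ආයුබෝවන්")` against a running server would return the audio bytes. `build_speech_request` can be inspected on its own without any network access.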
Health: —
Playback speed 1.00× pitch-preserved · applies to replay/buffered players
This model is a continuation-cloning TTS. Zero-shot, it improvises a voice per request. Pin a reference to get the same voice every time.
Fires N distinct texts in parallel to exercise continuous batching. If batching works, total wall-clock time is far less than N × the single-request latency. With the voice pinned, every clip should also share one voice.
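The batching check can be sketched outside the page as well. This is an illustration under assumptions, not the page's own implementation: `measure_batching` and `fake_synth` are hypothetical names, and `fake_synth` stands in for a real request function with a fixed sleep. The sketch times one request alone, then N requests in parallel, and compares the parallel wall time against N × the single latency.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def measure_batching(synth, texts):
    """Time one request alone, then all requests in parallel.

    `synth` is any callable taking a text and returning audio bytes.
    Warm the server up before calling this, otherwise a cold start
    inflates the single-request baseline.
    """
    t0 = time.perf_counter()
    synth(texts[0])
    single = time.perf_counter() - t0

    t0 = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(texts)) as pool:
        clips = list(pool.map(synth, texts))  # results keep the input order
    wall = time.perf_counter() - t0

    serial_estimate = len(texts) * single
    return {
        "single_s": single,
        "wall_s": wall,
        "serial_estimate_s": serial_estimate,
        "speedup": serial_estimate / wall,
        "clips": clips,
    }


def fake_synth(text):
    """Stand-in for a real request: fixed 50 ms latency, echoes the text as bytes."""
    time.sleep(0.05)
    return text.encode("utf-8")


if __name__ == "__main__":
    result = measure_batching(fake_synth, ["a", "b", "c", "d"])
    print(f"wall {result['wall_s']:.3f} s vs serial estimate "
          f"{result['serial_estimate_s']:.3f} s (speedup {result['speedup']:.1f}x)")
```

A speedup close to 1 would suggest the requests are being served one after another. A speedup approaching N suggests they are being batched. Checking that all clips share one voice still has to be done by ear.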
No build step. Keep ref.js next to this file for the built-in anchor voice. Audio is 48 kHz mono.