Yousuf Hossain yousuf-hossain-shanto

Text-to-Speech Pipeline with Kokoro TTS

A Python script that converts text into natural-sounding speech using the Kokoro TTS engine. The script processes a transcript file, generates speech segments, and merges them into a single audio file.

Features:

Reads text from a transcript file
Generates speech segments with customizable voice and speed settings
Saves individual audio segments and their corresponding text
Merges all audio segments into a single WAV file using FFmpeg
Organizes output in timestamped directories
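
The file-handling side of these features can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the helper names, the one-segment-per-line splitting rule, the directory name format, and the use of FFmpeg's concat demuxer are all assumptions. Speech synthesis itself is left out; the sketch assumes the WAV segments have already been written by the TTS engine.

```python
import subprocess
from datetime import datetime
from pathlib import Path
from typing import List, Sequence


def make_output_dir(base: Path, now: datetime) -> Path:
    """Create a timestamped run directory (name format is an assumption)."""
    run_dir = base / now.strftime("%Y%m%d_%H%M%S")
    run_dir.mkdir(parents=True, exist_ok=True)
    return run_dir


def split_transcript(text: str) -> List[str]:
    """Split a transcript into segments, one per non-empty line (assumed rule)."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def write_concat_list(wav_paths: Sequence[Path], list_file: Path) -> None:
    """Write the file list read by FFmpeg's concat demuxer.

    Paths containing single quotes would need escaping; not handled here.
    """
    lines = [f"file '{p.resolve().as_posix()}'" for p in wav_paths]
    list_file.write_text("\n".join(lines) + "\n", encoding="utf-8")


def ffmpeg_merge_command(list_file: Path, output: Path) -> List[str]:
    """Build the FFmpeg command that joins the listed WAV files without re-encoding."""
    return [
        "ffmpeg", "-y",
        "-f", "concat", "-safe", "0",
        "-i", str(list_file),
        "-c", "copy",
        str(output),
    ]


def merge_segments(wav_paths: Sequence[Path], run_dir: Path) -> Path:
    """Merge segments into run_dir/merged.wav (hypothetical name); needs ffmpeg on PATH."""
    list_file = run_dir / "segments.txt"
    output = run_dir / "merged.wav"
    write_concat_list(wav_paths, list_file)
    subprocess.run(ffmpeg_merge_command(list_file, output), check=True)
    return output
```

Stream copy (-c copy) only works when every segment shares the same sample rate, channel count, and sample format, which should hold when all segments come from the same TTS run.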