How to use IndexTeam/Index-TTS with IndexTTS:
# Download the model
from huggingface_hub import snapshot_download
snapshot_download("IndexTeam/Index-TTS", local_dir="checkpoints")

from indextts.infer import IndexTTS

# Ensure config.yaml is present in the checkpoints directory
tts = IndexTTS(model_dir="checkpoints", cfg_path="checkpoints/config.yaml")
voice = "path/to/your/reference_voice.wav"  # Path to the voice reference audio file
text = "Hello, how are you?"
output_path = "output_index.wav"
tts.infer(voice, text, output_path)
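To synthesize several sentences with the same reference voice, the single `infer` call can be wrapped in a small loop. The following is a minimal sketch, not part of the IndexTTS package: the helper name `synthesize_batch`, the `SupportsInfer` protocol, the `outputs` directory, and the `utt_NNN.wav` file naming are all hypothetical. It assumes only that the object passed in exposes `infer(voice, text, output_path)` as in the usage snippet.

```python
from pathlib import Path
from typing import Iterable, List, Protocol


class SupportsInfer(Protocol):
    """Hypothetical protocol: any object with infer(voice, text, output_path)."""

    def infer(self, voice: str, text: str, output_path: str) -> object: ...


def synthesize_batch(
    tts: SupportsInfer,
    voice: str,
    texts: Iterable[str],
    out_dir: str = "outputs",
) -> List[str]:
    """Synthesize each non-empty text with one reference voice.

    Returns the output paths in the order the files were requested.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    paths: List[str] = []
    for text in texts:
        text = text.strip()
        if not text:
            continue  # skip blank entries instead of synthesizing silence
        # Number files by how many utterances were requested so far.
        path = str(out / f"utt_{len(paths):03d}.wav")
        tts.infer(voice, text, path)
        paths.append(path)
    return paths
```

With a `tts` object and `voice` path created as in the usage snippet, `synthesize_batch(tts, voice, ["Hello, how are you?", "Nice to meet you."])` would request `outputs/utt_000.wav` and `outputs/utt_001.wav`.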
IndexTTS: An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System
IndexTTS
IndexTTS is a GPT-style text-to-speech (TTS) model based mainly on XTTS and Tortoise. It can correct the pronunciation of Chinese characters using pinyin and control pauses at any position through punctuation marks. We enhanced several modules of the system, including an improved speaker-condition feature representation and the integration of BigVGAN2 to improve audio quality. Trained on tens of thousands of hours of data, our system achieves state-of-the-art performance, outperforming popular TTS systems such as XTTS, CosyVoice2, Fish-Speech, and F5-TTS.
Experience IndexTTS: Please contact xuanwu@bilibili.com for more detailed information.
Acknowledgements
Citation
If you find our work helpful, please leave us a star and cite our paper.
@article{deng2025indextts,
title={IndexTTS: An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System},
author={Wei Deng and Siyi Zhou and Jingchen Shu and Jinchao Wang and Lu Wang},
journal={arXiv preprint arXiv:2502.05512},
year={2025}
}