Refolk

Top Text-to-speech repositories on GitHub

TTS models and inference servers for generating natural speech.

Ranked by stars across 556 repositories tagged text-to-speech. Refreshed daily.

  1. unslothai/unsloth · ★ 63,724 · ⑂ 5,615

    Web UI for training and running open models like Gemma 4, Qwen3.6, DeepSeek, gpt-oss locally.

    • fine-tuning
    • llama
    • llms
    • mistral
    • gemma
    • llama3
  2. RVC-Boss/GPT-SoVITS · ★ 57,248 · ⑂ 6,246

    1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

    • text-to-speech
    • tts
    • vits
    • voice-clone
    • voice-cloneai
    • voice-cloning
  3. coqui-ai/TTS · ★ 45,242 · ⑂ 6,073

    🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

    • python
    • text-to-speech
    • deep-learning
    • speech
    • pytorch
    • tts
  4. 2noise/ChatTTS · ★ 39,217 · ⑂ 4,249

    A generative speech model for daily dialogue.

    • agent
    • text-to-speech
    • chat
    • chatgpt
    • chattts
    • chinese
  5. babysor/MockingBird · ★ 36,897 · ⑂ 5,217

    🚀Clone a voice in 5 seconds to generate arbitrary speech in real-time

    • ai
    • speech
    • pytorch
    • deep-learning
    • text-to-speech
    • tts
  6. myshell-ai/OpenVoice · ★ 36,469 · ⑂ 4,073

    Instant voice cloning by MIT and MyShell. Audio foundation model.

    • text-to-speech
    • tts
    • voice-clone
    • zero-shot-tts
  7. FunAudioLLM/CosyVoice · ★ 20,904 · ⑂ 2,408

    Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.

    • audio-generation
    • gpt-4o
    • text-to-speech
    • tts
    • cantonese
    • chatbot
  8. index-tts/index-tts · ★ 20,368 · ⑂ 2,507

    An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System

    • bigvgan
    • cross-lingual
    • indextts
    • text-to-speech
    • tts
    • voice-clone
  9. nari-labs/dia · ★ 19,291 · ⑂ 1,683

    A TTS model capable of generating ultra-realistic dialogue in one pass.

    • ai
    • open-weight
    • text-to-speech
  10. OpenBMB/VoxCPM · ★ 17,517 · ⑂ 2,086

    VoxCPM2: Tokenizer-Free TTS for Multilingual Speech Generation, Creative Voice Design, and True-to-Life Cloning

    • audio
    • deeplearning
    • minicpm
    • python
    • pytorch
    • speech
  11. jianchang512/pyvideotrans · ★ 17,334 · ⑂ 2,130

    Translate the video from one language to another and embed dubbing & subtitles.

    • text-to-speech
    • video-transition
    • speech-to-text
  12. leon-ai/leon · ★ 17,216 · ⑂ 1,443

    🧠 Leon is your open-source personal assistant.

    • leon
    • personal-assistant
    • nodejs
    • python
    • ai
    • artificial-intelligence
  13. k2-fsa/sherpa-onnx · ★ 12,058 · ⑂ 1,369

    Speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC-V, RK NPU, Axera NPU, Ascend NPU, x86_64 servers, websocket server/client, support 12 programming languages

    • asr
    • onnx
    • windows
    • linux
    • macos
    • cpp
  14. rhasspy/piper · ★ 10,905 · ⑂ 971

    A fast, local neural text to speech system

    • speech-synthesis
    • text-to-speech
    • tts
  15. rany2/edge-tts · ★ 10,795 · ⑂ 1,008

    Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key

    • tts
    • speech-synthesis
    • text-to-speech
  16. mozilla/TTS · ★ 10,137 · ⑂ 1,323

    🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)

    • deep-learning
    • text-to-speech
    • python
    • pytorch
    • tacotron
    • tts
  17. espnet/espnet · ★ 9,828 · ⑂ 2,399

    End-to-End Speech Processing Toolkit

    • deep-learning
    • end-to-end
    • chainer
    • pytorch
    • kaldi
    • speech-recognition
  18. open-mmlab/Amphion · ★ 9,788 · ⑂ 811

    Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

    • audio-generation
    • audio-synthesis
    • audioldm
    • music-generation
    • naturalspeech2
    • singing-voice-conversion
  19. abus-aikorea/voice-pro · ★ 9,143 · ⑂ 1,226

    Gradio WebUI for creators and developers, featuring key TTS (Edge-TTS, kokoro) and zero-shot Voice Cloning (E2 & F5-TTS, CosyVoice), with Whisper audio processing, YouTube download, Demucs vocal isolation, and multilingual translation.

    • faster-whisper
    • tts
    • whisper
    • gradio
    • subtitles
    • transcription
  20. netease-youdao/EmotiVoice · ★ 8,477 · ⑂ 750

    EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine

    • pytorch
    • speech
    • speech-synthesis
    • tts
    • multi-speaker
    • text-to-speech
  21. Plachtaa/VALL-E-X · ★ 7,948 · ⑂ 780

    An open source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io/vallex/

    • emotional-speech
    • gpt
    • text-to-speech
    • voice-clone
    • transformer-architecture
    • tts
  22. jaywalnut310/vits · ★ 7,849 · ⑂ 1,389

    VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

    • tts
    • text-to-speech
    • pytorch
    • deep-learning
    • speech-synthesis
  23. myshell-ai/MeloTTS · ★ 7,403 · ⑂ 1,039

    High-quality multi-lingual text-to-speech library by MyShell.ai. Support English, Spanish, French, Chinese, Japanese and Korean.

    • text-to-speech
    • tts
    • chinese
    • english
    • french
    • japanese
  24. Blaizzy/mlx-audio · ★ 6,953 · ⑂ 579

    A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speech analysis on Apple Silicon.

    • apple-silicon
    • audio-processing
    • mlx
    • multimodal
    • speech-recognition
    • speech-synthesis
  25. espeak-ng/espeak-ng · ★ 6,426 · ⑂ 1,218

    eSpeak NG is an open source speech synthesizer that supports more than a hundred languages and accents.

    • espeak-ng
    • espeak
    • android
    • text-to-speech
    • speech-synthesis

Find engineers shipping Text-to-speech

The list above ranks the most-starred public repositories tagged with the text-to-speech topic, drawn from the public GitHub graph. Across the 556 repositories tagged this way, the maintainers and top contributors form a tight cluster of the people actually building text-to-speech systems.

Looking for engineers who’ve worked on Text-to-speech for real, not just listed it on LinkedIn? The fastest path is the contributor list of these repos. Their commits, issues, and READMEs are public proof of depth.

Refolk turns this list into a search. Ask for “maintainers of top Text-to-speech repos who are hiring”, “Text-to-speech engineers in San Francisco”, or “founders shipping Text-to-speech” and Refolk returns a ranked shortlist with sources.

How this list is built

Refolk searched GitHub for public repositories tagged with the Text-to-speech topic, ranked them by stargazer count, and kept those with at least 50 stars. The list refreshes once a day.

Last refreshed: Thu, 07 May 2026 05:55:57 GMT
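
The steps described under "How this list is built" (search by topic, keep repositories with at least 50 stars, rank by stargazer count) can be sketched roughly as follows. This is an illustrative sketch, not Refolk's actual pipeline: the function names, the `per_page` default, and the `example/tiny-tts` sample entry are assumptions made up for the example. The query URL uses GitHub's public repository search endpoint, and the field names `full_name` and `stargazers_count` follow the shape of its JSON results. No network call is made here; the ranking runs on a small in-memory sample.

```python
from urllib.parse import urlencode

GITHUB_SEARCH_URL = "https://api.github.com/search/repositories"


def build_search_url(topic: str, min_stars: int = 50, per_page: int = 25) -> str:
    """Build a repository-search URL for one topic, sorted by stars (descending)."""
    params = {
        "q": f"topic:{topic} stars:>={min_stars}",
        "sort": "stars",
        "order": "desc",
        "per_page": per_page,  # assumed page size, chosen to match a top-25 list
    }
    return f"{GITHUB_SEARCH_URL}?{urlencode(params)}"


def rank_repos(repos: list[dict], min_stars: int = 50) -> list[dict]:
    """Drop repositories below the star threshold, then sort by stars, highest first."""
    kept = [r for r in repos if r["stargazers_count"] >= min_stars]
    return sorted(kept, key=lambda r: r["stargazers_count"], reverse=True)


# Star counts for the first two entries are taken from the list above;
# "example/tiny-tts" is a hypothetical repository below the 50-star cutoff.
SAMPLE = [
    {"full_name": "rany2/edge-tts", "stargazers_count": 10795},
    {"full_name": "example/tiny-tts", "stargazers_count": 12},
    {"full_name": "rhasspy/piper", "stargazers_count": 10905},
]

if __name__ == "__main__":
    print(build_search_url("text-to-speech"))
    for rank, repo in enumerate(rank_repos(SAMPLE), start=1):
        print(f"{rank}. {repo['full_name']} ({repo['stargazers_count']:,} stars)")
```

Applying the star threshold both in the query and again locally is redundant by design: the local filter keeps the ranking correct even if the input comes from a cached or differently filtered source.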

Need a list like this for any search?

Refolk runs natural-language searches across GitHub, LinkedIn, and the open web.