Refolk

Top Text-to-speech repositories on GitHub

TTS models and inference servers for generating natural speech.

Ranked by stars across 584 repositories tagged text-to-speech. Refreshed daily.

  1. unslothai/unsloth · ★ 66,991 · ⑂ 6,018

    Unsloth Studio is a web UI for training and running open models like Gemma 4, Qwen3.6, DeepSeek, gpt-oss locally.

    • fine-tuning
    • llama
    • llms
    • mistral
    • gemma
    • llama3
  2. RVC-Boss/GPT-SoVITS · ★ 58,893 · ⑂ 6,440

    1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

    • text-to-speech
    • tts
    • vits
    • voice-clone
    • voice-cloneai
    • voice-cloning
  3. coqui-ai/TTS · ★ 45,590 · ⑂ 6,120

    🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

    • python
    • text-to-speech
    • deep-learning
    • speech
    • pytorch
    • tts
  4. 2noise/ChatTTS · ★ 39,483 · ⑂ 4,248

    A generative speech model for daily dialogue.

    • agent
    • text-to-speech
    • chat
    • chatgpt
    • chattts
    • chinese
  5. babysor/MockingBird · ★ 36,904 · ⑂ 5,205

    🚀Clone a voice in 5 seconds to generate arbitrary speech in real-time

    • ai
    • speech
    • pytorch
    • deep-learning
    • text-to-speech
    • tts
  6. myshell-ai/OpenVoice · ★ 36,755 · ⑂ 4,105

    Instant voice cloning by MIT and MyShell. Audio foundation model.

    • text-to-speech
    • tts
    • voice-clone
    • zero-shot-tts
  7. OpenBMB/VoxCPM · ★ 31,124 · ⑂ 3,506

    VoxCPM2: Tokenizer-Free TTS for Multilingual Speech Generation, Creative Voice Design, and True-to-Life Cloning

    • audio
    • deeplearning
    • minicpm
    • python
    • pytorch
    • speech
  8. FunAudioLLM/CosyVoice · ★ 21,760 · ⑂ 2,508

    Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.

    • audio-generation
    • gpt-4o
    • text-to-speech
    • tts
    • cantonese
    • chatbot
  9. index-tts/index-tts · ★ 21,293 · ⑂ 2,620

    An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System

    • bigvgan
    • cross-lingual
    • indextts
    • text-to-speech
    • tts
    • voice-clone
  10. nari-labs/dia · ★ 19,325 · ⑂ 1,687

    A TTS model capable of generating ultra-realistic dialogue in one pass.

    • ai
    • open-weight
    • text-to-speech
  11. jianchang512/pyvideotrans · ★ 18,044 · ⑂ 2,244

    Translate the video from one language to another and embed dubbing & subtitles.

    • text-to-speech
    • video-transition
    • speech-to-text
  12. leon-ai/leon · ★ 17,334 · ⑂ 1,446

    🧠 Leon is your open-source personal assistant.

    • leon
    • personal-assistant
    • nodejs
    • python
    • ai
    • artificial-intelligence
  13. k2-fsa/sherpa-onnx · ★ 13,089 · ⑂ 1,500

    Speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC-V, RK NPU, Axera NPU, Ascend NPU, x86_64 servers, websocket server/client, support 12 programming languages

    • asr
    • onnx
    • windows
    • linux
    • macos
    • cpp
  14. supertone-inc/supertonic · ★ 12,518 · ⑂ 1,286

    Lightning-Fast, On-Device, Multilingual TTS — running natively via ONNX.

    • cpp
    • csharp
    • go
    • ios
    • java
    • lightweight
  15. rany2/edge-tts · ★ 11,323 · ⑂ 1,050

    Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key

    • tts
    • speech-synthesis
    • text-to-speech
  16. rhasspy/piper · ★ 11,125 · ⑂ 1,031

    A fast, local neural text to speech system

    • speech-synthesis
    • text-to-speech
    • tts
  17. abus-aikorea/voice-pro · ★ 11,020 · ⑂ 1,604

    Gradio WebUI for creators and developers, featuring key TTS (Edge-TTS, kokoro) and zero-shot Voice Cloning (E2 & F5-TTS, CosyVoice), with Whisper audio processing, YouTube download, Demucs vocal isolation, and multilingual translation.

    • faster-whisper
    • tts
    • whisper
    • gradio
    • subtitles
    • transcription
  18. mozilla/TTS · ★ 10,154 · ⑂ 1,325

    🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)

    • deep-learning
    • text-to-speech
    • python
    • pytorch
    • tacotron
    • tts
  19. espnet/espnet · ★ 9,867 · ⑂ 2,412

    End-to-End Speech Processing Toolkit

    • deep-learning
    • end-to-end
    • chainer
    • pytorch
    • kaldi
    • speech-recognition
  20. open-mmlab/Amphion · ★ 9,847 · ⑂ 814

    Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

    • audio-generation
    • audio-synthesis
    • audioldm
    • music-generation
    • naturalspeech2
    • singing-voice-conversion
  21. netease-youdao/EmotiVoice · ★ 8,476 · ⑂ 755

    EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine

    • pytorch
    • speech
    • speech-synthesis
    • tts
    • multi-speaker
    • text-to-speech
  22. Plachtaa/VALL-E-X · ★ 7,939 · ⑂ 778

    An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available in https://plachtaa.github.io/vallex/

    • emotional-speech
    • gpt
    • text-to-speech
    • voice-clone
    • transformer-architecture
    • tts
  23. jaywalnut310/vits · ★ 7,866 · ⑂ 1,388

    VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

    • tts
    • text-to-speech
    • pytorch
    • deep-learning
    • speech-synthesis
  24. myshell-ai/MeloTTS · ★ 7,502 · ⑂ 1,049

    High-quality multi-lingual text-to-speech library by MyShell.ai. Support English, Spanish, French, Chinese, Japanese and Korean.

    • text-to-speech
    • tts
    • chinese
    • english
    • french
    • japanese
  25. Blaizzy/mlx-audio · ★ 7,402 · ⑂ 643

    A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speech analysis on Apple Silicon.

    • apple-silicon
    • audio-processing
    • mlx
    • multimodal
    • speech-recognition
    • speech-synthesis

Find engineers shipping Text-to-speech

The list above ranks the most-starred public repositories tagged with the Text-to-speech topic, drawn from the public GitHub graph. Across 584 repositories tagged this way, the maintainers and top contributors are a tight cluster of the people actually building Text-to-speech.

Looking for engineers who’ve worked on Text-to-speech for real, not just listed it on LinkedIn? The fastest path is the contributor list of these repos. Their commits, issues, and READMEs are public proof of depth.

Refolk turns this list into a search. Ask for “maintainers of top Text-to-speech repos who are hiring”, “Text-to-speech engineers in San Francisco”, or “founders shipping Text-to-speech” and Refolk returns a ranked shortlist with sources.

How this list is built

Refolk searched GitHub for public repositories tagged with the Text-to-speech topic, ranked them by stargazer count, and kept those with at least 50 stars. The list refreshes once a day.
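
The steps described above (search by topic, rank by stargazer count, keep repositories with at least 50 stars) can be sketched roughly as follows. This is a minimal illustration, not Refolk's actual pipeline: the function names `build_search_url` and `rank_repositories` are hypothetical, and the sketch assumes the public GitHub REST search endpoint with its `topic:` and `stars:` qualifiers and the `full_name` / `stargazers_count` fields it returns. No network request is made here; the ranking runs on records you have already fetched.

```python
from urllib.parse import urlencode

# Public GitHub REST endpoint for repository search.
GITHUB_SEARCH_URL = "https://api.github.com/search/repositories"


def build_search_url(topic: str, min_stars: int = 50, per_page: int = 25) -> str:
    """Build a search URL for repos tagged `topic`, sorted by stars (hypothetical helper)."""
    params = {
        "q": f"topic:{topic} stars:>={min_stars}",
        "sort": "stars",
        "order": "desc",
        "per_page": per_page,
    }
    return f"{GITHUB_SEARCH_URL}?{urlencode(params)}"


def rank_repositories(repos: list[dict], min_stars: int = 50) -> list[dict]:
    """Drop repos below the star threshold, then sort the rest by stars, highest first."""
    kept = [r for r in repos if r["stargazers_count"] >= min_stars]
    return sorted(kept, key=lambda r: r["stargazers_count"], reverse=True)


if __name__ == "__main__":
    # Two entries from the list above plus one made-up repo below the threshold.
    sample = [
        {"full_name": "rhasspy/piper", "stargazers_count": 11125},
        {"full_name": "example/tiny-tts", "stargazers_count": 12},  # hypothetical
        {"full_name": "coqui-ai/TTS", "stargazers_count": 45590},
    ]
    print(build_search_url("text-to-speech"))
    for rank, repo in enumerate(rank_repositories(sample), start=1):
        print(rank, repo["full_name"], repo["stargazers_count"])
```

Filtering locally as well as in the query is redundant when the search already carries `stars:>=50`, but it keeps the threshold explicit if the records come from a cache or a different source.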

Last refreshed: Sun, 21 Jun 2026 08:18:46 GMT

Need a list like this for any search?

Refolk runs natural-language searches across GitHub, LinkedIn, and the open web.