Refolk

Top Python Text-to-speech repositories on GitHub

TTS models and inference servers for generating natural speech. Filtered to projects whose primary language is Python.

Ranked by stars across 437 Python repositories tagged text-to-speech. Refreshed daily.

  1. unslothai/unsloth · ★ 66,992 · ⑂ 6,018

    Unsloth Studio is a web UI for training and running open models like Gemma 4, Qwen3.6, DeepSeek, gpt-oss locally.

    • fine-tuning
    • llama
    • llms
    • mistral
    • gemma
    • llama3
  2. RVC-Boss/GPT-SoVITS · ★ 58,893 · ⑂ 6,440

    1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

    • text-to-speech
    • tts
    • vits
    • voice-clone
    • voice-cloneai
    • voice-cloning
  3. coqui-ai/TTS · ★ 45,590 · ⑂ 6,120

    🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

    • python
    • text-to-speech
    • deep-learning
    • speech
    • pytorch
    • tts
  4. 2noise/ChatTTS · ★ 39,483 · ⑂ 4,248

    A generative speech model for daily dialogue.

    • agent
    • text-to-speech
    • chat
    • chatgpt
    • chattts
    • chinese
  5. babysor/MockingBird · ★ 36,904 · ⑂ 5,205

    🚀Clone a voice in 5 seconds to generate arbitrary speech in real-time

    • ai
    • speech
    • pytorch
    • deep-learning
    • text-to-speech
    • tts
  6. myshell-ai/OpenVoice · ★ 36,755 · ⑂ 4,105

    Instant voice cloning by MIT and MyShell. Audio foundation model.

    • text-to-speech
    • tts
    • voice-clone
    • zero-shot-tts
  7. OpenBMB/VoxCPM · ★ 31,124 · ⑂ 3,506

    VoxCPM2: Tokenizer-Free TTS for Multilingual Speech Generation, Creative Voice Design, and True-to-Life Cloning

    • audio
    • deeplearning
    • minicpm
    • python
    • pytorch
    • speech
  8. FunAudioLLM/CosyVoice · ★ 21,760 · ⑂ 2,508

    Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.

    • audio-generation
    • gpt-4o
    • text-to-speech
    • tts
    • cantonese
    • chatbot
  9. index-tts/index-tts · ★ 21,293 · ⑂ 2,620

    An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System

    • bigvgan
    • cross-lingual
    • indextts
    • text-to-speech
    • tts
    • voice-clone
  10. nari-labs/dia · ★ 19,325 · ⑂ 1,687

    A TTS model capable of generating ultra-realistic dialogue in one pass.

    • ai
    • open-weight
    • text-to-speech
  11. jianchang512/pyvideotrans · ★ 18,044 · ⑂ 2,244

    Translate the video from one language to another and embed dubbing & subtitles.

    • text-to-speech
    • video-transition
    • speech-to-text
  12. rany2/edge-tts · ★ 11,323 · ⑂ 1,050

    Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key

    • tts
    • speech-synthesis
    • text-to-speech
  13. abus-aikorea/voice-pro · ★ 11,020 · ⑂ 1,604

    Gradio WebUI for creators and developers, featuring key TTS (Edge-TTS, kokoro) and zero-shot Voice Cloning (E2 & F5-TTS, CosyVoice), with Whisper audio processing, YouTube download, Demucs vocal isolation, and multilingual translation.

    • faster-whisper
    • tts
    • whisper
    • gradio
    • subtitles
    • transcription
  14. espnet/espnet · ★ 9,867 · ⑂ 2,412

    End-to-End Speech Processing Toolkit

    • deep-learning
    • end-to-end
    • chainer
    • pytorch
    • kaldi
    • speech-recognition
  15. open-mmlab/Amphion · ★ 9,847 · ⑂ 814

    Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

    • audio-generation
    • audio-synthesis
    • audioldm
    • music-generation
    • naturalspeech2
    • singing-voice-conversion
  16. netease-youdao/EmotiVoice · ★ 8,476 · ⑂ 755

    EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine

    • pytorch
    • speech
    • speech-synthesis
    • tts
    • multi-speaker
    • text-to-speech
  17. Plachtaa/VALL-E-X · ★ 7,939 · ⑂ 778

    An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available in https://plachtaa.github.io/vallex/

    • emotional-speech
    • gpt
    • text-to-speech
    • voice-clone
    • transformer-architecture
    • tts
  18. jaywalnut310/vits · ★ 7,866 · ⑂ 1,388

    VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

    • tts
    • text-to-speech
    • pytorch
    • deep-learning
    • speech-synthesis
  19. myshell-ai/MeloTTS · ★ 7,502 · ⑂ 1,049

    High-quality multi-lingual text-to-speech library by MyShell.ai. Support English, Spanish, French, Chinese, Japanese and Korean.

    • text-to-speech
    • tts
    • chinese
    • english
    • french
    • japanese
  20. Blaizzy/mlx-audio · ★ 7,402 · ⑂ 643

    A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speech analysis on Apple Silicon.

    • apple-silicon
    • audio-processing
    • mlx
    • multimodal
    • speech-recognition
    • speech-synthesis
  21. debpalash/OmniVoice-Studio · ★ 7,356 · ⑂ 1,129

    The open-source ElevenLabs alternative for local voice cloning, design, create, dubbing and dictation Desktop App

    • tts
    • voice-cloning
    • voice-generation
    • voice-ai
    • asr
    • elevenlabs
  22. calesthio/OpenMontage · ★ 7,337 · ⑂ 1,178

    World's first open-source, agentic video production system. 12 pipelines, 52 tools, 500+ agent skills. Turn your AI coding assistant into a full video production studio.

    • agent
    • agentic-ai
    • ai
    • claude
    • copilot
    • cursor
  23. yl4579/StyleTTS2 · ★ 6,292 · ⑂ 691

    StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

    • deep-learning
    • pytorch
    • speaker-adaptation
    • speech-synthesis
    • text-to-speech
    • tts
  24. remsky/Kokoro-FastAPI · ★ 5,051 · ⑂ 833

    Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model w/multiplatform CPU, AMD, NVIDIA GPU PyTorch support, handling, and auto-stitching

    • fastapi
    • tts
    • tts-api
    • huggingface-spaces
    • kokoro
    • kokoro-tts
  25. denizsafak/abogen · ★ 4,897 · ⑂ 328

    Generate audiobooks from EPUBs, PDFs and text with synchronized captions.

    • audiobook
    • audiobooks
    • content-creation
    • content-creator
    • epub-converter
    • kokoro

Find Python engineers shipping Text-to-speech

The list above ranks the most-starred public Python repositories tagged with the text-to-speech topic on GitHub. Across the 437 matching repositories, the contributors form a tight cluster of engineers with both Python skills and hands-on text-to-speech experience.

That overlap is rare. Most Python engineers haven’t shipped Text-to-speech, and most Text-to-speech maintainers don’t write Python. The people on this list’s contributor graph are the ones who do both.

Refolk turns this list into a search. Ask for “Python Text-to-speech maintainers hiring” or “Python engineers shipping Text-to-speech in 2025” and Refolk returns a ranked shortlist with the commits, profiles, and projects behind each name.

How this list is built

Refolk searched GitHub for public Python repositories tagged with the Text-to-speech topic, ranked them by stargazer count, and kept those with at least 25 stars. The list refreshes once a day.
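
The selection steps described above can be approximated with a short sketch. This is not Refolk's actual pipeline, which the page does not show. It assumes GitHub's public repository search endpoint and its `topic:`, `language:`, and `stars:` qualifiers; the function names `build_search_url` and `rank_repos` are hypothetical, and the sample records reuse star counts from the list plus one made-up low-star repository.

```python
from urllib.parse import urlencode

# Assumption: GitHub's public REST search endpoint. The topic slug and the
# 25-star floor are taken from the description above; everything else is
# illustrative.
SEARCH_ENDPOINT = "https://api.github.com/search/repositories"


def build_search_url(topic="text-to-speech", language="python",
                     min_stars=25, per_page=25):
    """Build a repository search URL sorted by stars, highest first."""
    query = f"topic:{topic} language:{language} stars:>={min_stars}"
    params = {"q": query, "sort": "stars", "order": "desc",
              "per_page": per_page}
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


def rank_repos(repos, min_stars=25, limit=25):
    """Drop repos below the star floor, then rank the rest by star count."""
    kept = [r for r in repos if r["stargazers_count"] >= min_stars]
    kept.sort(key=lambda r: r["stargazers_count"], reverse=True)
    return kept[:limit]


if __name__ == "__main__":
    # Records shaped like GitHub search results (only the fields used here).
    sample = [
        {"full_name": "coqui-ai/TTS", "stargazers_count": 45590},
        {"full_name": "example/tiny-tts", "stargazers_count": 12},  # hypothetical
        {"full_name": "RVC-Boss/GPT-SoVITS", "stargazers_count": 58893},
    ]
    print(build_search_url())
    for rank, repo in enumerate(rank_repos(sample), start=1):
        print(rank, repo["full_name"], repo["stargazers_count"])
```

The filtering and ranking are kept separate from the URL builder so the ranking can be run on cached results without calling the API again; a daily refresh would simply repeat the fetch and re-rank.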

Last refreshed: Sun, 21 Jun 2026 08:15:10 GMT

Need a more specific search?

Refolk runs natural-language searches across GitHub, LinkedIn, and the open web.