Top Python Speech recognition repositories on GitHub

ASR models and pipelines for converting audio to text. Filtered to projects whose primary language is Python.

Ranked by stars across 460 Python repositories tagged speech-recognition. Refreshed daily.

1
huggingface/transformers ★ 161,766 · ⑂ 33,564
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
- nlp
- natural-language-processing
- pytorch
- pytorch-transformers
- transformer
- model-hub
2
SYSTRAN/faster-whisper ★ 23,761 · ⑂ 1,949
Faster Whisper transcription with CTranslate2
- deep-learning
- inference
- quantization
- speech-recognition
- speech-to-text
- transformer
3
m-bain/whisperX ★ 22,584 · ⑂ 2,311
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
- asr
- speech
- speech-recognition
- speech-to-text
- whisper
4
modelscope/FunASR ★ 18,390 · ⑂ 1,870
Industrial-grade speech recognition toolkit: 170x realtime, 50+ languages, speaker diarization, emotion detection, streaming, and OpenAI-compatible API.
- pytorch
- speech-recognition
- paraformer
- punctuation
- speaker-diarization
- voice-activity-detection
5
PaddlePaddle/PaddleSpeech ★ 12,622 · ⑂ 1,958
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
- transformer
- conformer
- speech-translation
- streaming-asr
- speech-alignment
- punctuation-restoration
6
speechbrain/speechbrain ★ 11,641 · ⑂ 1,702
A PyTorch-based Speech Toolkit
- speech-recognition
- speech-toolkit
- speaker-recognition
- speech-to-text
- speech-enhancement
- speech-separation
7
abus-aikorea/voice-pro ★ 11,020 · ⑂ 1,604
Gradio WebUI for creators and developers, featuring key TTS (Edge-TTS, kokoro) and zero-shot Voice Cloning (E2 & F5-TTS, CosyVoice), with Whisper audio processing, YouTube download, Demucs vocal isolation, and multilingual translation.
- faster-whisper
- tts
- whisper
- gradio
- subtitles
- transcription
8
espnet/espnet ★ 9,867 · ⑂ 2,412
End-to-End Speech Processing Toolkit
- deep-learning
- end-to-end
- chainer
- pytorch
- kaldi
- speech-recognition
9
Uberi/speech_recognition ★ 8,969 · ⑂ 2,420
Speech recognition module for Python, supporting several engines and APIs, online and offline.
- python
- audio
- speech-recognition
- speech-to-text
10
nl8590687/ASRT_SpeechRecognition ★ 8,376 · ⑂ 1,898
A Deep-Learning-Based Chinese Speech Recognition System
- tensorflow
- cnn
- ctc
- python
- keras
- speech-recognition
11
Blaizzy/mlx-audio ★ 7,403 · ⑂ 643
A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speech analysis on Apple Silicon.
- apple-silicon
- audio-processing
- mlx
- multimodal
- speech-recognition
- speech-synthesis
12
debpalash/OmniVoice-Studio ★ 7,361 · ⑂ 1,129
The open-source ElevenLabs alternative for local voice cloning, design, create, dubbing and dictation Desktop App
- tts
- voice-cloning
- voice-generation
- voice-ai
- asr
- elevenlabs
13
PaddlePaddle/PaddleX ★ 6,158 · ⑂ 1,198
All-in-One Development Tool based on PaddlePaddle
- classification
- segmentation
- deployment
- ocr
- time-series
- pp-chatocr
14
modelscope/FunClip ★ 5,835 · ⑂ 704
Open-source, accurate and easy-to-use video speech recognition & clipping tool. LLM-based AI clipping integrated.
- speech-recognition
- video-clip
- video-subtitles
- subtitles-generator
- speech-to-text
- gradio
15
wenet-e2e/wenet ★ 5,146 · ⑂ 1,183
Production First and Production Ready End-to-End Speech Recognition Toolkit
- e2e-models
- pytorch
- asr
- transformer
- conformer
- production-ready
16
Picovoice/porcupine ★ 4,864 · ⑂ 578
On-device wake word detection powered by deep learning
- wake-word-detection
- hotword
- keyword-spotting
- keyword-spotter
- wake-word
- wake-word-engine
17
yanshengjia/ml-road ★ 4,831 · ⑂ 1,707
Machine Learning and Agentic AI Resources, Practice and Research
- machine-learning
- deep-learning
- nlp
- computer-vision
- speech-recognition
- tensorflow
18
jianchang512/stt ★ 4,626 · ⑂ 489
Voice Recognition to Text Tool / An offline, locally run tool that converts audio and video to subtitles, with output in JSON, SRT subtitle, and plain-text formats
- speech
- speech-recognition
- speech-to-text
- stt
19
huggingface/distil-whisper ★ 4,085 · ⑂ 352
Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.
- audio
- speech-recognition
- whisper
20
ahmetoner/whisper-asr-webservice ★ 3,286 · ⑂ 579
OpenAI Whisper ASR Webservice API
- automatic-speech-recognition
- speech-recognition
- speech-to-text
- openai-whisper
- docker
- asr
21
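chenyme/Chenyme-AAVT ★ 3,099 · ⑂ 244
A fully automatic (audio) video translation project. It uses Whisper to recognize speech, a large AI model to translate the subtitles, and then merges the subtitles with the video to produce the translated video.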
- faster-whisper
- gpt-4
- speech-recognition
- video-translation
- whisper
- gpt-4o
22
tensorflow/lingvo ★ 2,864 · ⑂ 451
Lingvo
- speech-recognition
- translation
- speech-to-text
- machine-translation
- mnist
- seq2seq
23
zzw922cn/Automatic_Speech_Recognition ★ 2,835 · ⑂ 536
End-to-end Automatic Speech Recognition for Mandarin and English in TensorFlow
- automatic-speech-recognition
- tensorflow
- timit-dataset
- feature-vector
- phonemes
- data-preprocessing
24
linto-ai/whisper-timestamped ★ 2,819 · ⑂ 209
Multilingual Automatic Speech Recognition with word-level timestamps and confidence
- deep-learning
- speech
- speech-recognition
- speech-to-text
- asr
- machine-learning
25
mravanelli/pytorch-kaldi ★ 2,398 · ⑂ 444
pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by pytorch, while feature extraction, label computation, and decoding are performed with the kaldi toolkit.
- speech-recognition
- gru
- dnn
- kaldi
- rnn-model
- pytorch

Find Python engineers shipping Speech recognition

The list above ranks the most-starred public Python repositories tagged with the Speech recognition topic, drawn from the public GitHub graph. Across the 460 matching repositories, the contributors form a small group of engineers with both Python skills and hands-on speech recognition experience.

That overlap is rare. Most Python engineers haven’t shipped speech recognition systems, and most speech recognition maintainers don’t write Python. The people on this list’s contributor graph are the ones who do both.

Refolk turns this list into a search. Ask for “Python speech recognition maintainers hiring” or “Python engineers shipping speech recognition in 2025” and Refolk returns a ranked shortlist with the commits, profiles, and projects behind each name.

How this list is built

Refolk searched GitHub for public Python repositories tagged with the Speech recognition topic, ranked them by stargazer count, and kept those with at least 25 stars. The list refreshes once a day.
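
The filter-and-rank steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Refolk's actual pipeline: the `Repo` record, `build_search_query`, and `rank_repos` are hypothetical names, the two `example/...` repositories are invented for the demo, and the query string simply combines GitHub's `topic:`, `language:`, and `stars:` search qualifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Repo:
    """Hypothetical record holding only the fields the ranking needs."""
    full_name: str
    language: str
    topics: tuple[str, ...]
    stars: int


def build_search_query(topic="speech-recognition", language="python", min_stars=25):
    """Build a GitHub search query string from standard search qualifiers."""
    return f"topic:{topic} language:{language} stars:>={min_stars}"


def rank_repos(repos, topic="speech-recognition", language="Python", min_stars=25):
    """Keep repos matching language, topic, and star floor; sort by stars, descending."""
    matching = [
        r for r in repos
        if r.language == language and topic in r.topics and r.stars >= min_stars
    ]
    return sorted(matching, key=lambda r: r.stars, reverse=True)


# Two entries from the list above, plus two invented repos that should be dropped.
sample = [
    Repo("m-bain/whisperX", "Python", ("asr", "speech-recognition"), 22584),
    Repo("SYSTRAN/faster-whisper", "Python", ("speech-recognition",), 23761),
    Repo("example/tiny-asr", "Python", ("speech-recognition",), 12),   # below 25 stars
    Repo("example/cpp-asr", "C++", ("speech-recognition",), 9000),     # not Python
]

ranked = rank_repos(sample)
for position, repo in enumerate(ranked, start=1):
    print(position, repo.full_name, repo.stars)
```

Filtering before sorting keeps the ranking independent of how the candidates were fetched, so the same function would apply to results from any source that exposes language, topics, and star counts.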

Last refreshed: Sun, 21 Jun 2026 09:12:50 GMT

Need a more specific search?

Refolk runs natural-language searches across GitHub, LinkedIn, and the open web.

Related lists

See all repository lists.
