Top Python speech recognition repositories on GitHub
ASR models and pipelines for converting audio to text. Filtered to projects whose primary language is Python.
Ranked by stars across 444 Python repositories tagged speech-recognition. Refreshed daily.
- 1. huggingface/transformers ★ 160,334 · ⑂ 33,126
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
- nlp
- natural-language-processing
- pytorch
- pytorch-transformers
- transformer
- model-hub
- 2. SYSTRAN/faster-whisper ★ 22,697 · ⑂ 1,854
Faster Whisper transcription with CTranslate2
- deep-learning
- inference
- quantization
- speech-recognition
- speech-to-text
- transformer
- 3. m-bain/whisperX ★ 21,737 · ⑂ 2,254
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
- asr
- speech
- speech-recognition
- speech-to-text
- whisper
- 4. modelscope/FunASR ★ 15,973 · ⑂ 1,662
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
- conformer
- pytorch
- speech-recognition
- paraformer
- punctuation
- speaker-diarization
- 5. PaddlePaddle/PaddleSpeech ★ 12,596 · ⑂ 1,956
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
- transformer
- conformer
- speech-translation
- streaming-asr
- speech-alignment
- punctuation-restoration
- 6. speechbrain/speechbrain ★ 11,517 · ⑂ 1,686
A PyTorch-based Speech Toolkit
- speech-recognition
- speech-toolkit
- speaker-recognition
- speech-to-text
- speech-enhancement
- speech-separation
- 7. espnet/espnet ★ 9,828 · ⑂ 2,399
End-to-End Speech Processing Toolkit
- deep-learning
- end-to-end
- chainer
- pytorch
- kaldi
- speech-recognition
- 8. abus-aikorea/voice-pro ★ 9,150 · ⑂ 1,230
Gradio WebUI for creators and developers, featuring key TTS (Edge-TTS, kokoro) and zero-shot Voice Cloning (E2 & F5-TTS, CosyVoice), with Whisper audio processing, YouTube download, Demucs vocal isolation, and multilingual translation.
- faster-whisper
- tts
- whisper
- gradio
- subtitles
- transcription
- 9. Uberi/speech_recognition ★ 8,964 · ⑂ 2,428
Speech recognition module for Python, supporting several engines and APIs, online and offline.
- python
- audio
- speech-recognition
- speech-to-text
- 10. nl8590687/ASRT_SpeechRecognition ★ 8,372 · ⑂ 1,899
A Deep-Learning-Based Chinese Speech Recognition System
- tensorflow
- cnn
- ctc
- python
- keras
- speech-recognition
- 11. FunAudioLLM/SenseVoice ★ 8,096 · ⑂ 739
Multilingual Voice Understanding Model
- ai
- asr
- gpt-4o
- speech-recognition
- speech-to-text
- aigc
- 12. Blaizzy/mlx-audio ★ 6,953 · ⑂ 579
A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speech analysis on Apple Silicon.
- apple-silicon
- audio-processing
- mlx
- multimodal
- speech-recognition
- speech-synthesis
- 13. PaddlePaddle/PaddleX ★ 6,128 · ⑂ 1,188
All-in-One Development Tool based on PaddlePaddle
- classification
- segmentation
- deployment
- ocr
- time-series
- pp-chatocr
- 14. modelscope/FunClip ★ 5,578 · ⑂ 688
Open-source, accurate, and easy-to-use video speech recognition and clipping tool, with LLM-based AI clipping integrated.
- speech-recognition
- video-clip
- video-subtitles
- subtitles-generator
- speech-to-text
- gradio
- 15. wenet-e2e/wenet ★ 5,103 · ⑂ 1,182
Production First and Production Ready End-to-End Speech Recognition Toolkit
- e2e-models
- pytorch
- asr
- transformer
- conformer
- production-ready
- 16. Picovoice/porcupine ★ 4,808 · ⑂ 574
On-device wake word detection powered by deep learning
- wake-word-detection
- hotword
- keyword-spotting
- keyword-spotter
- wake-word
- wake-word-engine
- 17. yanshengjia/ml-road ★ 4,754 · ⑂ 1,701
Machine Learning and Agentic AI Resources, Practice and Research
- machine-learning
- deep-learning
- nlp
- computer-vision
- speech-recognition
- tensorflow
- 18. jianchang512/stt ★ 4,512 · ⑂ 483
Voice Recognition to Text Tool / An offline, locally run tool that converts audio and video to subtitles, with output as JSON, SRT subtitles, or plain text.
- speech
- speech-recognition
- speech-to-text
- stt
- 19. huggingface/distil-whisper ★ 4,081 · ⑂ 355
Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.
- audio
- speech-recognition
- whisper
- 20. ahmetoner/whisper-asr-webservice ★ 3,254 · ⑂ 574
OpenAI Whisper ASR Webservice API
- automatic-speech-recognition
- speech-recognition
- speech-to-text
- openai-whisper
- docker
- asr
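- 21. chenyme/Chenyme-AAVT ★ 3,043 · ⑂ 242
A fully automatic (audio) video translation project. It uses Whisper to recognize speech, a large AI model to translate the subtitles, and finally merges the subtitles into the video to produce the translated video.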
- faster-whisper
- gpt-4
- speech-recognition
- video-translation
- whisper
- gpt-4o
- 22. tensorflow/lingvo ★ 2,863 · ⑂ 450
Lingvo
- speech-recognition
- translation
- speech-to-text
- machine-translation
- mnist
- seq2seq
- 23. zzw922cn/Automatic_Speech_Recognition ★ 2,838 · ⑂ 536
End-to-end Automatic Speech Recognition for Mandarin and English in TensorFlow
- automatic-speech-recognition
- tensorflow
- timit-dataset
- feature-vector
- phonemes
- data-preprocessing
- 24. linto-ai/whisper-timestamped ★ 2,812 · ⑂ 211
Multilingual Automatic Speech Recognition with word-level timestamps and confidence
- deep-learning
- speech
- speech-recognition
- speech-to-text
- asr
- machine-learning
- 25. mravanelli/pytorch-kaldi ★ 2,398 · ⑂ 444
pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by PyTorch, while feature extraction, label computation, and decoding are performed with the Kaldi toolkit.
- speech-recognition
- gru
- dnn
- kaldi
- rnn-model
- pytorch
Find Python engineers shipping speech recognition
The list above ranks the most-starred public Python repositories tagged with the speech-recognition topic, drawn from the public GitHub graph. Across the 444 matching repositories, the contributors form a tight cluster of engineers with both Python skills and hands-on speech recognition experience.
That overlap is rare. Most Python engineers haven’t shipped speech recognition, and most speech recognition maintainers don’t write Python. The people on this list’s contributor graph are the ones who do both.
Refolk turns this list into a search. Ask for “Python speech recognition maintainers hiring” or “Python engineers shipping speech recognition in 2025” and Refolk returns a ranked shortlist with the commits, profiles, and projects behind each name.
How this list is built
Last refreshed: Thu, 07 May 2026 06:52:13 GMT
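The page describes the ranking only in outline: Python repositories tagged speech-recognition, ordered by stars. A minimal sketch of that filter-and-sort step might look like the following. The `Repo` class, `rank_repos`, and `format_entry` are illustrative names rather than the site's actual pipeline, and the final sample entry is hypothetical, included only to show the language filter; the other three entries reuse star and fork counts from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Repo:
    full_name: str
    stars: int
    forks: int
    language: str
    topics: tuple


def rank_repos(repos, language="Python", topic="speech-recognition"):
    """Keep repos matching the language and topic, most-starred first."""
    matching = [r for r in repos if r.language == language and topic in r.topics]
    return sorted(matching, key=lambda r: r.stars, reverse=True)


def format_entry(rank, repo):
    """Render one entry in the same shape as the list on this page."""
    return f"{rank}. {repo.full_name} ★ {repo.stars:,} · ⑂ {repo.forks:,}"


SAMPLE = [
    Repo("speechbrain/speechbrain", 11517, 1686, "Python", ("speech-recognition",)),
    Repo("m-bain/whisperX", 21737, 2254, "Python", ("asr", "speech-recognition")),
    Repo("SYSTRAN/faster-whisper", 22697, 1854, "Python", ("speech-recognition",)),
    # Hypothetical non-Python entry: dropped by the language filter despite its stars.
    Repo("example/hypothetical-js-asr", 99999, 1, "JavaScript", ("speech-recognition",)),
]

if __name__ == "__main__":
    for rank, repo in enumerate(rank_repos(SAMPLE), start=1):
        print(format_entry(rank, repo))
```

Sorting on stars alone matches the "Ranked by stars" note at the top of the page; how ties are broken, and how a repository's primary language is determined, is not stated there and is left out of the sketch.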