Top Python natural language processing repositories on GitHub
Tokenizers, classical NLP, and modern language model tooling. Filtered to projects whose primary language is Python.
Ranked by stars across 2,276 Python repositories tagged nlp. Refreshed daily.
- 1. huggingface/transformers ★ 160,327 · ⑂ 33,126
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
- nlp
- natural-language-processing
- pytorch
- pytorch-transformers
- transformer
- model-hub
- 2. hiyouga/LlamaFactory ★ 70,990 · ⑂ 8,673
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
- fine-tuning
- llama
- llm
- peft
- transformers
- rlhf
- 3. apachecn/ailearning ★ 42,235 · ⑂ 11,571
AiLearning: data analysis + machine learning in practice + linear algebra + PyTorch + NLTK + TF2
- fp-growth
- apriori
- mahchine-leaning
- naivebayes
- svm
- adaboost
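- 4. 666ghj/BettaFish ★ 40,776 · ⑂ 7,537
Weiyu (微舆): a multi-agent public opinion analysis assistant that anyone can use. It breaks through information cocoons, restores the original picture of public opinion, predicts future trends, and supports decision-making. Implemented from scratch, with no dependency on any framework.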
- agent-framework
- data-analysis
- multi-agent-system
- nlp
- public-opinion-analysis
- python3
- 5. google-research/bert ★ 40,001 · ⑂ 9,718
TensorFlow code and pre-trained models for BERT
- nlp
- natural-language-processing
- natural-language-understanding
- tensorflow
- 6. google/langextract ★ 36,394 · ⑂ 2,503
A Python library for extracting structured information from unstructured text using LLMs with precise source grounding and interactive visualization.
- llm
- nlp
- python
- gemini-ai
- information-extration
- large-language-models
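- 7. hankcs/HanLP ★ 36,299 · ⑂ 10,903
Chinese word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, constituency parsing, semantic dependency parsing, semantic role labeling, coreference resolution, style transfer, semantic similarity, new word discovery, keyword and phrase extraction, automatic summarization, text classification and clustering, pinyin and simplified/traditional Chinese conversion, natural language processing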
- nlp
- natural-language-processing
- hanlp
- pos-tagging
- dependency-parser
- text-classification
- 8. explosion/spaCy ★ 33,546 · ⑂ 4,679
💫 Industrial-strength Natural Language Processing (NLP) in Python
- natural-language-processing
- data-science
- machine-learning
- python
- cython
- nlp
- 9. stanford-oval/storm ★ 28,162 · ⑂ 2,566
An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations.
- large-language-models
- nlp
- knowledge-curation
- naacl
- report-generation
- retrieval-augmented-generation
- 10. microsoft/unilm ★ 22,116 · ⑂ 2,698
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
- nlp
- pre-trained-model
- unilm
- minilm
- layoutlm
- layoutxlm
- 11. huggingface/datasets ★ 21,492 · ⑂ 3,196
🤗 The largest hub of ready-to-use datasets for AI models with fast, easy-to-use and efficient data manipulation tools
- nlp
- datasets
- pytorch
- tensorflow
- pandas
- numpy
- 12. RasaHQ/rasa ★ 21,153 · ⑂ 4,910
💬 Open source machine learning framework to automate text- and voice-based conversations: NLU, dialogue management, connect to Slack, Facebook, and more - Create chatbots and voice assistants
- nlp
- machine-learning
- machine-learning-library
- bot
- bots
- botkit
- 13. ymcui/Chinese-LLaMA-Alpaca ★ 18,945 · ⑂ 1,855
Chinese LLaMA & Alpaca large language models + local CPU/GPU training and deployment
- llm
- plm
- pre-trained-language-models
- alpaca
- llama
- nlp
- 14. piskvorky/gensim ★ 16,408 · ⑂ 4,412
Topic Modelling for Humans
- gensim
- topic-modeling
- information-retrieval
- machine-learning
- natural-language-processing
- nlp
- 15
- 16. flairNLP/flair ★ 14,374 · ⑂ 2,115
A very simple framework for state-of-the-art Natural Language Processing (NLP)
- pytorch
- nlp
- named-entity-recognition
- sequence-labeling
- semantic-role-labeling
- word-embeddings
- 17
- 18. PaddlePaddle/PaddleNLP ★ 12,937 · ⑂ 3,044
Easy-to-use and powerful LLM and SLM library with awesome model zoo.
- nlp
- embedding
- bert
- ernie
- paddlenlp
- pretrained-models
- 19. neuml/txtai ★ 12,471 · ⑂ 808
💡 All-in-one AI framework for semantic search, LLM orchestration and language model workflows
- python
- search
- nlp
- semantic-search
- vector-search
- txtai
- 20. allenai/allennlp ★ 11,893 · ⑂ 2,223
An open-source NLP research library, built on PyTorch.
- pytorch
- nlp
- natural-language-processing
- deep-learning
- data-science
- python
- 21. huggingface/text-generation-inference ★ 10,851 · ⑂ 1,266
Large Language Model Text Generation Inference
- bloom
- nlp
- pytorch
- inference
- gpt
- deep-learning
- 22. chiphuyen/stanford-tensorflow-tutorials ★ 10,381 · ⑂ 4,258
This repository contains code examples for Stanford's course TensorFlow for Deep Learning Research.
- tensorflow
- deep-learning
- tutorial
- nlp
- natural-language-processing
- chatbot
- 23. ymcui/Chinese-BERT-wwm ★ 10,204 · ⑂ 1,390
Pre-Training with Whole Word Masking for Chinese BERT (Chinese BERT-wwm model series)
- chinese-bert
- tensorflow
- pytorch
- bert
- nlp
- roberta
- 24. bigscience-workshop/petals ★ 10,122 · ⑂ 607
🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading
- bloom
- deep-learning
- distributed-systems
- language-models
- large-language-models
- machine-learning
- 25. jadore801120/attention-is-all-you-need-pytorch ★ 9,714 · ⑂ 2,090
A PyTorch implementation of the Transformer model in "Attention is All You Need".
- attention
- deep-learning
- attention-is-all-you-need
- pytorch
- nlp
- natural-language-processing
Find Python engineers shipping natural language processing
The list above ranks the most-starred public Python repositories tagged with the natural language processing topic, drawn from the public GitHub graph. Across 2,276 matching repositories, the contributors are a tight cluster of engineers with both Python chops and real natural language processing experience.
That overlap is rare. Most Python engineers haven’t shipped natural language processing, and most natural language processing maintainers don’t write Python. The people on this list’s contributor graph are the ones who do both.
Refolk turns this list into a search. Ask for “Python natural language processing maintainers hiring” or “Python engineers shipping natural language processing in 2025” and Refolk returns a ranked shortlist with the commits, profiles, and projects behind each name.
How this list is built
Last refreshed: Thu, 07 May 2026 05:54:07 GMT
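As a rough illustration of the ranking described above (keep repositories whose primary language is Python and that carry the nlp topic, then order by stars), here is a minimal sketch. It is not the site's actual pipeline: the `Repo` type, the `rank_by_stars` function, and the `example/not-python` entry are hypothetical names introduced for this example. The other three entries reuse star counts and topics from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Repo:
    """Hypothetical record holding the fields the ranking needs."""
    name: str
    language: str
    topics: tuple[str, ...]
    stars: int


def rank_by_stars(repos, language="Python", topic="nlp", limit=25):
    """Keep repos matching the language and topic, most-starred first."""
    matching = [
        repo for repo in repos
        if repo.language == language and topic in repo.topics
    ]
    return sorted(matching, key=lambda repo: repo.stars, reverse=True)[:limit]


SAMPLE = [
    Repo("explosion/spaCy", "Python", ("nlp", "cython"), 33_546),
    Repo("huggingface/transformers", "Python", ("nlp", "pytorch"), 160_327),
    Repo("google-research/bert", "Python", ("nlp", "tensorflow"), 40_001),
    # Hypothetical non-Python entry, included only to show the language filter.
    Repo("example/not-python", "Rust", ("nlp",), 99_999),
]

ranked = rank_by_stars(SAMPLE)
for position, repo in enumerate(ranked, start=1):
    print(f"{position}. {repo.name} ★ {repo.stars:,}")
```

The language filter runs before the sort, so a heavily starred repository in another language never displaces a Python one.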
Need a more specific search?
Refolk runs natural-language searches across GitHub, LinkedIn, and the open web.
Related lists
- Python · Machine learning
- Python · Deep learning
- Python · Computer vision
- Python · LLM
- Python · AI agents
- Python · RAG
- Python · Embeddings
- Python · Transformers
See all repository lists.