Top natural language processing repositories on GitHub
Tokenizers, classical NLP, and modern language model tooling.
Ranked by stars across 2,818 repositories tagged nlp. Refreshed daily.
- 1. huggingface/transformers ★ 160,327 · ⑂ 33,126
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
- nlp
- natural-language-processing
- pytorch
- pytorch-transformers
- transformer
- model-hub
- 2. hiyouga/LlamaFactory ★ 70,990 · ⑂ 8,673
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
- fine-tuning
- llama
- llm
- peft
- transformers
- rlhf
- 3. microsoft/AI-For-Beginners ★ 47,253 · ⑂ 9,730
12 Weeks, 24 Lessons, AI for All!
- deep-learning
- artificial-intelligence
- machine-learning
- ai
- computer-vision
- nlp
- 4. apachecn/ailearning ★ 42,235 · ⑂ 11,571
AiLearning: data analysis + machine learning in practice + linear algebra + PyTorch + NLTK + TF2
- fp-growth
- apriori
- mahchine-leaning
- naivebayes
- svm
- adaboost
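- 5. 666ghj/BettaFish ★ 40,776 · ⑂ 7,537
微舆 (Weiyu): a multi-agent public-opinion analysis assistant that anyone can use. It breaks through information cocoons, restores the true picture of public opinion, predicts future trends, and supports decision-making. Built from scratch, with no dependency on any framework.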
- agent-framework
- data-analysis
- multi-agent-system
- nlp
- public-opinion-analysis
- python3
- 6. google-research/bert ★ 40,001 · ⑂ 9,718
TensorFlow code and pre-trained models for BERT
- nlp
- natural-language-processing
- natural-language-understanding
- tensorflow
- 7. google/langextract ★ 36,394 · ⑂ 2,503
A Python library for extracting structured information from unstructured text using LLMs with precise source grounding and interactive visualization.
- llm
- nlp
- python
- gemini-ai
- information-extration
- large-language-models
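- 8. hankcs/HanLP ★ 36,299 · ⑂ 10,903
Chinese word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, constituency parsing, semantic dependency parsing, semantic role labeling, coreference resolution, style transfer, semantic similarity, new word discovery, keyphrase extraction, automatic summarization, text classification and clustering, pinyin and simplified/traditional Chinese conversion, natural language processing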
- nlp
- natural-language-processing
- hanlp
- pos-tagging
- dependency-parser
- text-classification
- 9. explosion/spaCy ★ 33,546 · ⑂ 4,679
💫 Industrial-strength Natural Language Processing (NLP) in Python
- natural-language-processing
- data-science
- machine-learning
- python
- cython
- nlp
- 10. ashishpatel26/500-AI-Machine-learning-Deep-learning-Computer-vision-NLP-Projects-with-code ★ 33,480 · ⑂ 7,119
500 AI Machine learning Deep learning Computer vision NLP Projects with code
- awesome
- machine-learning
- deep-learning
- machine-learning-projects
- deep-learning-project
- computer-vision-project
- 11. stanford-oval/storm ★ 28,162 · ⑂ 2,566
An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations.
- large-language-models
- nlp
- knowledge-curation
- naacl
- report-generation
- retrieval-augmented-generation
- 12. NirDiamant/RAG_Techniques ★ 27,164 · ⑂ 3,267
This repository showcases various advanced techniques for Retrieval-Augmented Generation (RAG) systems. Each technique has a detailed notebook tutorial.
- rag
- tutorials
- langchain
- llama-index
- llms
- python
- 13. deepset-ai/haystack ★ 25,102 · ⑂ 2,767
Open-source AI orchestration framework for building context-engineered, production-ready LLM applications. Design modular pipelines and agent workflows with explicit control over retrieval, routing, memory, and generation. Built for scalable agents, RAG, multimodal applications, semantic search, and conversational systems.
- nlp
- question-answering
- pytorch
- semantic-search
- information-retrieval
- summarization
- 14. lukasmasuch/best-of-ml-python ★ 23,458 · ⑂ 3,114
🏆 A ranked list of awesome machine learning Python libraries. Updated weekly.
- python
- machine-learning
- data-science
- nlp
- data-visualization
- tensorflow
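- 15. AiHubCN/Awesome-Chinese-LLM ★ 22,559 · ⑂ 2,127
A curated collection of open-source Chinese large language models, focusing on smaller models that can be deployed privately and are cheaper to train, including base models, vertical-domain fine-tuning and applications, datasets, and tutorials.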
- llm
- nlp
- chatglm
- chinese
- llama
- awesome-lists
- 16. microsoft/unilm ★ 22,116 · ⑂ 2,698
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
- nlp
- pre-trained-model
- unilm
- minilm
- layoutlm
- layoutxlm
- 17. huggingface/datasets ★ 21,492 · ⑂ 3,196
🤗 The largest hub of ready-to-use datasets for AI models with fast, easy-to-use and efficient data manipulation tools
- nlp
- datasets
- pytorch
- tensorflow
- pandas
- numpy
- 18. RasaHQ/rasa ★ 21,153 · ⑂ 4,910
💬 Open source machine learning framework to automate text- and voice-based conversations: NLU, dialogue management, connect to Slack, Facebook, and more - Create chatbots and voice assistants
- nlp
- machine-learning
- machine-learning-library
- bot
- bots
- botkit
- 19. AccumulateMore/CV ★ 20,727 · ⑂ 2,371
✅ (Completed) Extremely comprehensive deep learning notes [Tudui: PyTorch] [Li Mu: Dive into Deep Learning] [Andrew Ng: Deep Learning] [Dafei: LLM Agents]
- book
- chinese
- computer-vision
- deep-learning
- machine-learning
- natural-language-processing
- 20. AI4Finance-Foundation/FinGPT ★ 19,957 · ⑂ 2,835
FinGPT: Open-Source Financial Large Language Models! Revolutionize 🔥 We release the trained model on HuggingFace.
- chatgpt
- finance
- fintech
- large-language-models
- machine-learning
- nlp
- 21. ymcui/Chinese-LLaMA-Alpaca ★ 18,945 · ⑂ 1,855
Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment
- llm
- plm
- pre-trained-language-models
- alpaca
- llama
- nlp
- 22. keon/awesome-nlp ★ 18,496 · ⑂ 2,795
📖 A curated list of resources dedicated to Natural Language Processing (NLP)
- natural-language-processing
- deep-learning
- machine-learning
- language
- awesome
- awesome-list
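- 23. NLP-LOVE/ML-NLP ★ 17,656 · ⑂ 4,635
This project collects the knowledge points and code implementations commonly tested in machine learning, deep learning, and NLP interviews, which are also the theoretical fundamentals every algorithm engineer should know.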
- nlp
- machine-learning
- deep-learning
- 24. dair-ai/ML-YouTube-Courses ★ 17,195 · ⑂ 2,104
📺 Discover the latest machine learning / AI courses on YouTube.
- machine-learning
- deep-learning
- nlp
- natural-language-processing
- ai
- data-science
- 25. bharathgs/Awesome-pytorch-list ★ 16,494 · ⑂ 2,833
A comprehensive list of PyTorch-related content on GitHub, such as models, implementations, helper libraries, and tutorials.
- pytorch
- python
- machine-learning
- deep-learning
- tutorials
- papers
Find engineers shipping natural language processing
The list above ranks the most-starred public repositories tagged with the natural language processing topic, drawn from the public GitHub graph. Across the 2,818 repositories tagged this way, the maintainers and top contributors form a tight cluster of the people actually building natural language processing software.
Looking for engineers who have worked on natural language processing for real, not just listed it on LinkedIn? The fastest path is the contributor lists of these repos: their commits, issues, and READMEs are public proof of depth.
Refolk turns this list into a search. Ask for “maintainers of top natural language processing repos who are hiring”, “natural language processing engineers in San Francisco”, or “founders shipping natural language processing”, and Refolk returns a ranked shortlist with sources.
How this list is built
Last refreshed: Thu, 07 May 2026 05:55:14 GMT
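The ranking rule stated at the top of this page (repositories carrying a topic tag, ordered by star count) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the pipeline behind this page, which is not documented here. The `Repo` record and the `search_url`, `rank`, and `format_entry` helpers are hypothetical names introduced for the example, the sample figures are copied from three entries in the list above, and the URL targets GitHub's public repository search endpoint with its `sort=stars` option.

```python
from dataclasses import dataclass
from urllib.parse import urlencode


@dataclass(frozen=True)
class Repo:
    """Hypothetical record holding the fields shown for each list entry."""
    full_name: str
    stars: int
    forks: int


def search_url(topic: str, per_page: int = 25) -> str:
    """Build a GitHub repository-search URL for one topic, most-starred first."""
    query = urlencode({
        "q": f"topic:{topic}",
        "sort": "stars",
        "order": "desc",
        "per_page": per_page,
    })
    return f"https://api.github.com/search/repositories?{query}"


def rank(repos: list[Repo], limit: int = 25) -> list[tuple[int, Repo]]:
    """Sort by star count (descending) and attach 1-based positions."""
    ordered = sorted(repos, key=lambda repo: repo.stars, reverse=True)
    return list(enumerate(ordered[:limit], start=1))


def format_entry(position: int, repo: Repo) -> str:
    """Render one entry in the same shape as the list on this page."""
    return f"{position}. {repo.full_name} ★ {repo.stars:,} · ⑂ {repo.forks:,}"


# Sample data copied from the list above, deliberately out of order.
SAMPLE = [
    Repo("explosion/spaCy", 33_546, 4_679),
    Repo("huggingface/transformers", 160_327, 33_126),
    Repo("google-research/bert", 40_001, 9_718),
]

lines = [format_entry(position, repo) for position, repo in rank(SAMPLE)]
```

Joining `lines` with newlines gives a small ranked list in the page's format. Fetching live data would mean requesting `search_url("nlp")` and mapping each returned item's `full_name`, `stargazers_count`, and `forks_count` into a `Repo`; that step is left out so the sketch runs without network access.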