Refolk

Top natural language processing repositories on GitHub

Tokenizers, classical NLP, and modern language model tooling.

Ranked by stars across 2,818 repositories tagged nlp. Refreshed daily.

  1. huggingface/transformers · ★ 160,327 · ⑂ 33,126

    🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

    • nlp
    • natural-language-processing
    • pytorch
    • pytorch-transformers
    • transformer
    • model-hub
  2. hiyouga/LlamaFactory · ★ 70,990 · ⑂ 8,673

    Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

    • fine-tuning
    • llama
    • llm
    • peft
    • transformers
    • rlhf
  3. microsoft/AI-For-Beginners · ★ 47,253 · ⑂ 9,730

    12 Weeks, 24 Lessons, AI for All!

    • deep-learning
    • artificial-intelligence
    • machine-learning
    • ai
    • computer-vision
    • nlp
  4. apachecn/ailearning · ★ 42,235 · ⑂ 11,571

    AiLearning: data analysis + machine learning in practice + linear algebra + PyTorch + NLTK + TF2

    • fp-growth
    • apriori
    • mahchine-leaning
    • naivebayes
    • svm
    • adaboost
  5. 666ghj/BettaFish · ★ 40,776 · ⑂ 7,537

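    微舆: a multi-agent public-opinion analysis assistant that anyone can use. It breaks through information cocoons, restores the full picture of public opinion, predicts future trends, and supports decision-making. Built from scratch, with no dependency on any framework.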

    • agent-framework
    • data-analysis
    • multi-agent-system
    • nlp
    • public-opinion-analysis
    • python3
  6. google-research/bert · ★ 40,001 · ⑂ 9,718

    TensorFlow code and pre-trained models for BERT

    • nlp
    • google
    • natural-language-processing
    • natural-language-understanding
    • tensorflow
  7. google/langextract · ★ 36,394 · ⑂ 2,503

    A Python library for extracting structured information from unstructured text using LLMs with precise source grounding and interactive visualization.

    • llm
    • nlp
    • python
    • gemini-ai
    • information-extration
    • large-language-models
  8. hankcs/HanLP · ★ 36,299 · ⑂ 10,903

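    Chinese word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, constituency parsing, semantic dependency parsing, semantic role labeling, coreference resolution, style transfer, semantic similarity, new word discovery, keyphrase extraction, automatic summarization, text classification and clustering, pinyin and simplified/traditional Chinese conversion, natural language processing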

    • nlp
    • natural-language-processing
    • hanlp
    • pos-tagging
    • dependency-parser
    • text-classification
  9. explosion/spaCy · ★ 33,546 · ⑂ 4,679

    💫 Industrial-strength Natural Language Processing (NLP) in Python

    • natural-language-processing
    • data-science
    • machine-learning
    • python
    • cython
    • nlp
  10.

    500 AI Machine learning Deep learning Computer vision NLP Projects with code

    • awesome
    • machine-learning
    • deep-learning
    • machine-learning-projects
    • deep-learning-project
    • computer-vision-project
  11. stanford-oval/storm · ★ 28,162 · ⑂ 2,566

    An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations.

    • large-language-models
    • nlp
    • knowledge-curation
    • naacl
    • report-generation
    • retrieval-augmented-generation
  12. NirDiamant/RAG_Techniques · ★ 27,164 · ⑂ 3,267

    This repository showcases various advanced techniques for Retrieval-Augmented Generation (RAG) systems. Each technique has a detailed notebook tutorial.

    • rag
    • tutorials
    • langchain
    • llama-index
    • llms
    • python
  13. deepset-ai/haystack · ★ 25,102 · ⑂ 2,767

    Open-source AI orchestration framework for building context-engineered, production-ready LLM applications. Design modular pipelines and agent workflows with explicit control over retrieval, routing, memory, and generation. Built for scalable agents, RAG, multimodal applications, semantic search, and conversational systems.

    • nlp
    • question-answering
    • pytorch
    • semantic-search
    • information-retrieval
    • summarization
  14. lukasmasuch/best-of-ml-python · ★ 23,458 · ⑂ 3,114

    🏆 A ranked list of awesome machine learning Python libraries. Updated weekly.

    • python
    • machine-learning
    • data-science
    • nlp
    • data-visualization
    • tensorflow
  15. AiHubCN/Awesome-Chinese-LLM · ★ 22,559 · ⑂ 2,127

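    A curated collection of open-source Chinese large language models, focusing on smaller models that can be deployed privately and are cheaper to train, covering base models, domain-specific fine-tunes and applications, datasets, and tutorials.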

    • llm
    • nlp
    • chatglm
    • chinese
    • llama
    • awesome-lists
  16. microsoft/unilm · ★ 22,116 · ⑂ 2,698

    Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

    • nlp
    • pre-trained-model
    • unilm
    • minilm
    • layoutlm
    • layoutxlm
  17. huggingface/datasets · ★ 21,492 · ⑂ 3,196

    🤗 The largest hub of ready-to-use datasets for AI models with fast, easy-to-use and efficient data manipulation tools

    • nlp
    • datasets
    • pytorch
    • tensorflow
    • pandas
    • numpy
  18. RasaHQ/rasa · ★ 21,153 · ⑂ 4,910

    💬 Open source machine learning framework to automate text- and voice-based conversations: NLU, dialogue management, connect to Slack, Facebook, and more - Create chatbots and voice assistants

    • nlp
    • machine-learning
    • machine-learning-library
    • bot
    • bots
    • botkit
  19. AccumulateMore/CV · ★ 20,727 · ⑂ 2,371

    ✅ (Completed) Extremely comprehensive deep learning notes [Tudui: PyTorch] [Mu Li: Dive into Deep Learning] [Andrew Ng: Deep Learning] [Dafei: LLM Agents]

    • book
    • chinese
    • computer-vision
    • deep-learning
    • machine-learning
    • natural-language-processing
  20. AI4Finance-Foundation/FinGPT · ★ 19,957 · ⑂ 2,835

    FinGPT: Open-Source Financial Large Language Models! Revolutionize 🔥 We release the trained model on HuggingFace.

    • chatgpt
    • finance
    • fintech
    • large-language-models
    • machine-learning
    • nlp
  21. ymcui/Chinese-LLaMA-Alpaca · ★ 18,945 · ⑂ 1,855

    Chinese LLaMA & Alpaca large language models + local CPU/GPU training and deployment

    • llm
    • plm
    • pre-trained-language-models
    • alpaca
    • llama
    • nlp
  22. keon/awesome-nlp · ★ 18,496 · ⑂ 2,795

    📖 A curated list of resources dedicated to Natural Language Processing (NLP)

    • natural-language-processing
    • deep-learning
    • machine-learning
    • language
    • awesome
    • awesome-list
  23. NLP-LOVE/ML-NLP · ★ 17,656 · ⑂ 4,635

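    This project covers the knowledge points and code implementations that come up most often in machine learning, deep learning, and NLP interviews; it is also the theoretical foundation every algorithm engineer needs to master.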

    • nlp
    • machine-learning
    • deep-learning
  24. dair-ai/ML-YouTube-Courses · ★ 17,195 · ⑂ 2,104

    📺 Discover the latest machine learning / AI courses on YouTube.

    • machine-learning
    • deep-learning
    • nlp
    • natural-language-processing
    • ai
    • data-science
  25. bharathgs/Awesome-pytorch-list · ★ 16,494 · ⑂ 2,833

    A comprehensive list of PyTorch-related content on GitHub, such as different models, implementations, helper libraries, tutorials, etc.

    • pytorch
    • python
    • machine-learning
    • deep-learning
    • tutorials
    • papers

Find engineers shipping natural language processing

The list above ranks the most-starred public repositories tagged with the natural language processing topic, drawn from the public GitHub graph. The maintainers and top contributors of the 2,818 repositories tagged this way form a tight cluster of the people actually building natural language processing.

Looking for engineers who’ve worked on natural language processing for real, not just listed it on LinkedIn? The fastest path is the contributor list of these repos. Their commits, issues, and READMEs are public proof of depth.

Refolk turns this list into a search. Ask for “maintainers of top natural language processing repos who are hiring”, “natural language processing engineers in San Francisco”, or “founders shipping natural language processing” and Refolk returns a ranked shortlist with sources.

How this list is built

Refolk searched GitHub for public repositories tagged with the natural language processing topic, ranked them by stargazer count, and kept those with at least 50 stars. The list refreshes once a day.
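
The selection rule described here (topic match, at least 50 stars, most-starred first) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Refolk's actual pipeline: the helper names `build_search_url` and `rank_repos` are made up for this example, the topic slug `nlp` and the list size of 25 are read off this page, and the `example/tiny-nlp` record is hypothetical. The URL targets GitHub's public repository search endpoint, but the sketch does not send any request.

```python
from urllib.parse import urlencode

MIN_STARS = 50   # threshold stated in the text
LIST_SIZE = 25   # assumption: the page shows 25 entries


def build_search_url(topic, min_stars=MIN_STARS, per_page=LIST_SIZE):
    """Return a GitHub repository-search URL for `topic`, most-starred first."""
    params = {
        "q": f"topic:{topic} stars:>={min_stars}",
        "sort": "stars",
        "order": "desc",
        "per_page": per_page,
    }
    return "https://api.github.com/search/repositories?" + urlencode(params)


def rank_repos(repos, min_stars=MIN_STARS, limit=LIST_SIZE):
    """Drop repositories below the star threshold and sort the rest by stars."""
    kept = [r for r in repos if r["stargazers_count"] >= min_stars]
    kept.sort(key=lambda r: r["stargazers_count"], reverse=True)
    return kept[:limit]


# Sample records shaped like GitHub search results; the last one is made up.
sample = [
    {"full_name": "explosion/spaCy", "stargazers_count": 33546},
    {"full_name": "huggingface/transformers", "stargazers_count": 160327},
    {"full_name": "example/tiny-nlp", "stargazers_count": 12},  # hypothetical
]

for rank, repo in enumerate(rank_repos(sample), start=1):
    print(rank, repo["full_name"], repo["stargazers_count"])
# prints:
# 1 huggingface/transformers 160327
# 2 explosion/spaCy 33546
```

Filtering again in `rank_repos` is redundant when the query already carries the `stars:>=50` qualifier, but it keeps the ranking step usable on records from any source.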

Last refreshed: Thu, 07 May 2026 05:55:14 GMT

Need a list like this for any search?

Refolk runs natural-language searches across GitHub, LinkedIn, and the open web.
