Refolk

Top RAG repositories on GitHub

Retrieval-augmented generation pipelines, embeddings, and grounding tooling.

Ranked by stars across 1,133 repositories tagged rag. Refreshed daily.

  1. langgenius/dify
    ★ 140,399 · ⑂ 22,018

    Production-ready platform for agentic workflow development.

    • ai
    • gpt
    • llm
    • openai
    • python
    • rag
  2. langchain-ai/langchain
    ★ 135,982 · ⑂ 22,480

    The agent engineering platform. Available in TypeScript!

    • ai
    • anthropic
    • gemini
    • langchain
    • llm
    • openai
  3. open-webui/open-webui
    ★ 135,830 · ⑂ 19,340

    User-friendly AI Interface (Supports Ollama, OpenAI API, ...)

    • ollama
    • ollama-webui
    • llm
    • webui
    • self-hosted
    • llm-ui
  4. Shubhamsaboo/awesome-llm-apps
    ★ 109,087 · ⑂ 16,139

    100+ AI Agent & RAG apps you can actually run — clone, customize, ship.

    • llms
    • rag
    • python
    • agents
  5. infiniflow/ragflow
    ★ 79,858 · ⑂ 9,089

    RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to create a superior context layer for LLMs

    • ai
    • ai-agents
    • context-engine
    • llm-apps
    • rag
    • retrieval-augmented-generation
  6. PaddlePaddle/PaddleOCR
    ★ 77,196 · ⑂ 10,373

    Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.

    • ocr
    • chineseocr
    • pdf2markdown
    • pp-ocr
    • pp-structure
    • document-parsing
  7. dair-ai/Prompt-Engineering-Guide
    ★ 74,277 · ⑂ 8,030

    🐙 Guides, papers, lessons, notebooks and resources for prompt engineering, context engineering, RAG, and AI Agents.

    • deep-learning
    • prompt-engineering
    • openai
    • chatgpt
    • language-model
    • generative-ai
  8. thedotmack/claude-mem
    ★ 73,048 · ⑂ 6,271

    A Claude Code plugin that automatically captures everything Claude does during your coding sessions, compresses it with AI (using Claude's agent-sdk), and injects relevant context back into future sessions.

    • ai
    • ai-agents
    • ai-memory
    • anthropic
    • artificial-intelligence
    • claude
  9. pathwaycom/llm-app
    ★ 59,826 · ⑂ 1,436

    Ready-to-run cloud templates for RAG, AI pipelines, and enterprise search with live data. 🐳 Docker-friendly. ⚡ Always in sync with SharePoint, Google Drive, S3, Kafka, PostgreSQL, real-time data APIs, and more.

    • chatbot
    • hugging-face
    • llm
    • llm-local
    • llm-prompting
    • llm-security
  10. Mintplex-Labs/anything-llm
    ★ 59,643 · ⑂ 6,448

    The all-in-one AI productivity accelerator. On device and privacy first with no annoying setup or configuration.

    • rag
    • lmstudio
    • localai
    • vector-database
    • ollama
    • local-llm
  11. mem0ai/mem0
    ★ 54,963 · ⑂ 6,226

    Universal memory layer for AI Agents

    • ai
    • chatgpt
    • llm
    • python
    • chatbots
    • rag
  12. FlowiseAI/Flowise
    ★ 52,614 · ⑂ 24,279

    Build AI Agents, Visually

    • artificial-intelligence
    • chatgpt
    • large-language-models
    • low-code
    • no-code
    • javascript
  13. run-llama/llama_index
    ★ 49,181 · ⑂ 7,365

    LlamaIndex is the leading document agent and OCR platform

    • agents
    • application
    • data
    • fine-tuning
    • framework
    • llamaindex
  14. jeecgboot/JeecgBoot
    ★ 46,100 · ⑂ 15,968

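    AI low-code platform with dual "low-code + zero-code" modes: build a business system in 5 minutes with zero code, or generate front-end and back-end code in one click in low-code mode. Built-in AI applications support AI chat, knowledge bases, workflow orchestration, MCP and plugins, and a wide range of models. Skills capabilities: draw flowcharts, design forms, and generate systems from a single sentence. Promotes an "AI generation → online configuration → code generation → manual merge" development model that removes 80% of the repetitive work in Java projects, raising efficiency without sacrificing flexibility.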

    • antd
    • activiti
    • codegenerator
    • springcloud
    • springboot
    • low-code
  15. milvus-io/milvus
    ★ 44,154 · ⑂ 3,988

    Milvus is a high-performance, cloud-native vector database built for scalable vector ANN search

    • anns
    • nearest-neighbor-search
    • faiss
    • vector-search
    • image-search
    • hnsw
  16. safishamsi/graphify
    ★ 44,006 · ⑂ 4,794

    AI coding assistant skill (Claude Code, Codex, OpenCode, Cursor, Gemini CLI, and more). Turn any folder of code, SQL schemas, R scripts, shell scripts, docs, papers, images, or videos into a queryable knowledge graph. App code + database schema + infrastructure in one graph.

    • claude-code
    • graphrag
    • knowledge-graph
    • codex
    • openclaw
    • skills
  17. datawhalechina/hello-agents
    ★ 43,201 · ⑂ 5,252

    📚 "Building Agents from Scratch": a from-the-ground-up tutorial on agent principles and practice

    • agent
    • tutorial
    • llm
    • rag
  18. QuivrHQ/quivr
    ★ 39,134 · ⑂ 3,752

    Opinionated RAG for integrating GenAI in your apps 🧠 Focus on your product rather than the RAG. Easy integration in existing products with customisation! Any LLM: GPT4, Groq, Llama. Any Vectorstore: PGVector, Faiss. Any Files. Any way you want.

    • ai
    • llm
    • api
    • chatbot
    • chatgpt
    • database
  19. mindsdb/mindsdb
    ★ 39,122 · ⑂ 6,199

    AI Data Vault - A query engine for AI Agents to securely query data from any datasource

    • ai
    • artificial-inteligence
    • databases
    • llms
    • rag
    • agents
  20. chatchat-space/Langchain-Chatchat
    ★ 37,967 · ⑂ 6,197

    Langchain-Chatchat (formerly Langchain-ChatGLM): a local-knowledge-based RAG and Agent app built with Langchain on LLMs such as ChatGLM, Qwen, and Llama

    • chatglm
    • langchain
    • llm
    • knowledge-base
    • llama
    • chatbot
  21. HKUDS/LightRAG
    ★ 34,834 · ⑂ 4,934

    [EMNLP2025] "LightRAG: Simple and Fast Retrieval-Augmented Generation"

    • knowledge-graph
    • large-language-models
    • retrieval-augmented-generation
    • genai
    • graphrag
    • llm
  22. patchy631/ai-engineering-hub
    ★ 34,717 · ⑂ 5,751

    In-depth tutorials on LLMs, RAGs and real-world AI agent applications.

    • agents
    • ai
    • llms
    • machine-learning
    • mcp
    • rag
  23. khoj-ai/khoj
    ★ 34,417 · ⑂ 2,185

    Your AI second brain. Self-hostable. Get answers from the web or your docs. Build custom agents, schedule automations, do deep research. Turn any online or local LLM into your personal, autonomous AI (gpt, claude, gemini, llama, qwen, mistral). Get started - free.

    • semantic-search
    • emacs
    • obsidian-md
    • chat
    • chatgpt
    • ai
  24. ZhuLinsen/daily_stock_analysis
    ★ 34,311 · ⑂ 33,993

    LLM-powered stock analysis system for A-share, H-share, and US markets: multi-source market data + real-time news + LLM decision dashboard + multi-channel push notifications, running on a schedule at zero cost.

    • ai
    • aigc
    • gemini
    • llm
    • quant
    • stock
  25. ItzCrazyKns/Vane
    ★ 34,170 · ⑂ 3,728

    Vane is an AI-powered answering engine.

    • ai-search-engine
    • search-engine
    • open-source-ai-search-engine
    • perplexica
    • artificial-intelligence
    • machine-learning

Find engineers shipping RAG

The list above ranks the most-starred public repositories tagged with the RAG topic, drawn from the public GitHub graph. Across 1,133 repositories tagged this way, the maintainers and top contributors are a tight cluster of the people actually building RAG.

Looking for engineers who’ve worked on RAG for real, not just listed it on LinkedIn? The fastest path is the contributor list of these repos. Their commits, issues, and READMEs are public proof of depth.

Refolk turns this list into a search. Ask for “maintainers of top RAG repos who are hiring”, “RAG engineers in San Francisco”, or “founders shipping RAG” and Refolk returns a ranked shortlist with sources.

How this list is built

Refolk searched GitHub for public repositories tagged with the RAG topic, ranked them by stargazer count, and kept those with at least 50 stars. The list refreshes once a day.
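
The steps described above (search by topic, filter by a star threshold, rank by stars) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Refolk's actual implementation: the helper names `build_search_url` and `rank_repos`, the cutoff of 25 results, and the sample records are hypothetical. The query uses GitHub's public repository search endpoint with its `topic:` and `stars:` qualifiers, and `stargazers_count` is the field GitHub returns for a repository's star count.

```python
from urllib.parse import urlencode

GITHUB_SEARCH_URL = "https://api.github.com/search/repositories"


def build_search_url(topic: str, min_stars: int = 50, per_page: int = 25) -> str:
    """Build a repository-search URL for one topic, sorted by stars (descending)."""
    params = {
        "q": f"topic:{topic} stars:>={min_stars}",
        "sort": "stars",
        "order": "desc",
        "per_page": per_page,
    }
    return f"{GITHUB_SEARCH_URL}?{urlencode(params)}"


def rank_repos(repos: list[dict], min_stars: int = 50, limit: int = 25) -> list[dict]:
    """Drop repos below the star threshold, then sort by stars, highest first."""
    kept = [r for r in repos if r["stargazers_count"] >= min_stars]
    kept.sort(key=lambda r: r["stargazers_count"], reverse=True)
    return kept[:limit]


# Sample records for illustration; the first and last use star counts from the list above.
sample = [
    {"full_name": "langchain-ai/langchain", "stargazers_count": 135_982},
    {"full_name": "example/tiny-rag-demo", "stargazers_count": 12},  # hypothetical repo
    {"full_name": "langgenius/dify", "stargazers_count": 140_399},
]

for rank, repo in enumerate(rank_repos(sample), start=1):
    print(f"{rank}. {repo['full_name']} ({repo['stargazers_count']:,} stars)")
```

Filtering locally in `rank_repos` duplicates what the `stars:>=50` qualifier already does server-side; keeping both makes the threshold explicit and lets the ranking step be tested without network access.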

Last refreshed: Thu, 07 May 2026 05:55:05 GMT

Need a list like this for any search?

Refolk runs natural-language searches across GitHub, LinkedIn, and the open web.