Refolk

Top RAG repositories on GitHub

Retrieval-augmented generation pipelines, embeddings, and grounding tooling.

Ranked by stars across 1,261 repositories tagged rag. Refreshed daily.

  1. langgenius/dify · ★ 146,000 · ⑂ 22,960

    Production-ready platform for agentic workflow development.

    • ai
    • gpt
    • llm
    • openai
    • python
    • rag
  2. open-webui/open-webui · ★ 142,458 · ⑂ 20,485

    User-friendly AI Interface (Supports Ollama, OpenAI API, ...)

    • ollama
    • ollama-webui
    • llm
    • webui
    • self-hosted
    • llm-ui
  3. langchain-ai/langchain · ★ 139,781 · ⑂ 23,182

    The agent engineering platform.

    • ai
    • anthropic
    • gemini
    • langchain
    • llm
    • openai
  4. Shubhamsaboo/awesome-llm-apps · ★ 115,183 · ⑂ 17,106

    100+ AI Agent & RAG apps you can actually run — clone, customize, ship.

    • llms
    • rag
    • python
    • agents
  5. thedotmack/claude-mem · ★ 83,453 · ⑂ 7,221

    Persistent Context Across Sessions for Every Agent – Captures everything your agent does during sessions, compresses it with AI, and injects relevant context back into future sessions. Works with Claude Code, OpenClaw, Codex, Gemini, Hermes, Copilot, OpenCode + More

    • ai
    • ai-agents
    • ai-memory
    • anthropic
    • artificial-intelligence
    • claude
  6. infiniflow/ragflow · ★ 83,263 · ⑂ 9,637

    RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to create a superior context layer for LLMs

    • ai
    • ai-agents
    • context-engine
    • llm-apps
    • rag
    • retrieval-augmented-generation
  7. PaddlePaddle/PaddleOCR · ★ 83,156 · ⑂ 10,830

    Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.

    • ocr
    • chineseocr
    • pdf2markdown
    • pp-ocr
    • pp-structure
    • document-parsing
  8. dair-ai/Prompt-Engineering-Guide · ★ 75,798 · ⑂ 8,266

    🐙 Guides, papers, lessons, notebooks and resources for prompt engineering, context engineering, RAG, and AI Agents.

    • deep-learning
    • prompt-engineering
    • openai
    • chatgpt
    • language-model
    • generative-ai
  9. safishamsi/graphify · ★ 69,991 · ⑂ 7,030

    AI coding assistant skill (Claude Code, Codex, OpenCode, Cursor, Gemini CLI, and more). Turn any folder of code, SQL schemas, R scripts, shell scripts, docs, papers, images, or videos into a queryable knowledge graph. App code + database schema + infrastructure in one graph.

    • claude-code
    • graphrag
    • knowledge-graph
    • codex
    • openclaw
    • skills
  10. Mintplex-Labs/anything-llm · ★ 61,869 · ⑂ 6,751

    Stop renting your intelligence. Own it with AnythingLLM. Everything you need for a powerful local-first agent experience

    • rag
    • localai
    • vector-database
    • llm
    • ai-agents
    • multimodal
  11. datawhalechina/hello-agents · ★ 60,594 · ⑂ 7,467

    📚 "Building Agents from Scratch": a from-the-ground-up tutorial on the principles and practice of AI agents

    • agent
    • tutorial
    • llm
    • rag
  12. pathwaycom/llm-app · ★ 59,288 · ⑂ 1,433

    Ready-to-run cloud templates for RAG, AI pipelines, and enterprise search with live data. 🐳Docker-friendly.⚡Always in sync with Sharepoint, Google Drive, S3, Kafka, PostgreSQL, real-time data APIs, and more.

    • chatbot
    • hugging-face
    • llm
    • llm-local
    • llm-prompting
    • llm-security
  13. mem0ai/mem0 · ★ 59,007 · ⑂ 6,808

    Universal memory layer for AI Agents

    • ai
    • chatgpt
    • llm
    • python
    • chatbots
    • rag
  14. FlowiseAI/Flowise · ★ 53,852 · ⑂ 24,564

    Build AI Agents, Visually

    • artificial-intelligence
    • chatgpt
    • large-language-models
    • low-code
    • no-code
    • javascript
  15. run-llama/llama_index · ★ 50,246 · ⑂ 7,597

    LlamaIndex is the leading document agent and OCR platform

    • agents
    • application
    • data
    • fine-tuning
    • framework
    • llamaindex
  16. jeecgboot/JeecgBoot · ★ 46,809 · ⑂ 16,055

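    AI low-code platform driven by both low-code and zero-code. Low-code generates front-end and back-end code in one click; zero-code builds a system in 5 minutes; AI Skills draws workflows, designs forms, and generates an entire system from a single sentence. Includes built-in AI chat, knowledge base, workflow orchestration, MCP plugins, and more, and is compatible with mainstream large models. Pioneers the "AI generation → online configuration → code generation → manual merge → AI modification" development model, eliminating 80% of the repetitive work in Java projects and improving efficiency without sacrificing flexibility.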

    • antd
    • activiti
    • codegenerator
    • springcloud
    • springboot
    • low-code
  17. milvus-io/milvus · ★ 44,862 · ⑂ 4,077

    Milvus is a high-performance, cloud-native vector database built for scalable vector ANN search

    • anns
    • nearest-neighbor-search
    • faiss
    • vector-search
    • image-search
    • hnsw
  18. chopratejas/headroom · ★ 42,381 · ⑂ 2,923

    Compress tool outputs, logs, files, and RAG chunks before they reach the LLM. 60-95% fewer tokens, same answers. Library, proxy, MCP server.

    • agent
    • ai
    • anthropic
    • compression
    • context-engineering
    • context-window
  19. mindsdb/minds · ★ 39,318 · ⑂ 6,206

    General-purpose AI designed for knowledge workers — creators, strategists, and operators — and individuals seeking AI systems they can truly control to help them get work done, with full flexibility to extend and deploy anywhere (VPC, on-prem, or cloud).

    • ai
    • artificial-inteligence
    • databases
    • llms
    • rag
    • agents
  20. QuivrHQ/quivr · ★ 39,163 · ⑂ 3,723

    Opinionated RAG for integrating GenAI in your apps 🧠 Focus on your product rather than the RAG. Easy integration in existing products with customisation! Any LLM: GPT4, Groq, Llama. Any Vectorstore: PGVector, Faiss. Any Files. Any way you want.

    • ai
    • llm
    • api
    • chatbot
    • chatgpt
    • database
  21. chatchat-space/Langchain-Chatchat · ★ 38,200 · ⑂ 6,215

    Langchain-Chatchat (formerly Langchain-ChatGLM): a local-knowledge-based RAG and Agent application built with Langchain on language models such as ChatGLM, Qwen, and Llama

    • chatglm
    • langchain
    • llm
    • knowledge-base
    • llama
    • chatbot
  22. HKUDS/LightRAG · ★ 36,816 · ⑂ 5,192

    [EMNLP2025] "LightRAG: Simple and Fast Retrieval-Augmented Generation"

    • knowledge-graph
    • large-language-models
    • retrieval-augmented-generation
    • genai
    • graphrag
    • llm
  23. patchy631/ai-engineering-hub · ★ 35,904 · ⑂ 5,957

    In-depth tutorials on LLMs, RAGs and real-world AI agent applications.

    • agents
    • ai
    • llms
    • machine-learning
    • mcp
    • rag
  24. ItzCrazyKns/Vane · ★ 35,383 · ⑂ 3,899

    Vane is an AI-powered answering engine.

    • ai-search-engine
    • search-engine
    • open-source-ai-search-engine
    • perplexica
    • artificial-intelligence
    • machine-learning
  25. langchain-ai/langgraph · ★ 35,318 · ⑂ 5,924

    Build resilient agents.

    • agents
    • ai
    • ai-agents
    • chatgpt
    • deepagents
    • enterprise

Find engineers shipping RAG

The list above ranks the most-starred public repositories tagged with the RAG topic, drawn from the public GitHub graph. Across 1,261 repositories tagged this way, the maintainers and top contributors are a tight cluster of the people actually building RAG.

Looking for engineers who’ve worked on RAG for real, not just listed it on LinkedIn? The fastest path is the contributor list of these repos. Their commits, issues, and READMEs are public proof of depth.

Refolk turns this list into a search. Ask for “maintainers of top RAG repos who are hiring”, “RAG engineers in San Francisco”, or “founders shipping RAG” and Refolk returns a ranked shortlist with sources.

How this list is built

Refolk searched GitHub for public repositories tagged with the RAG topic, ranked them by stargazer count, and kept those with at least 50 stars. The list refreshes once a day.
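
The steps above (search by topic, rank by stars, keep repositories with at least 50 stars) can be sketched roughly as follows. This is only an illustration, not Refolk's actual implementation, which the page does not describe. It assumes the public GitHub REST search endpoint and its `stargazers_count` / `full_name` response fields; the function names `build_search_url` and `rank_repositories` and the sample data are hypothetical.

```python
from urllib.parse import urlencode

# Public GitHub REST endpoint for repository search (assumed data source).
GITHUB_SEARCH_URL = "https://api.github.com/search/repositories"


def build_search_url(topic: str, min_stars: int = 50, per_page: int = 25) -> str:
    """Build a search URL for one topic, most-starred repositories first."""
    params = {
        # Search qualifiers: repositories tagged with the topic, above a star floor.
        "q": f"topic:{topic} stars:>={min_stars}",
        "sort": "stars",
        "order": "desc",
        "per_page": per_page,
    }
    return f"{GITHUB_SEARCH_URL}?{urlencode(params)}"


def rank_repositories(repos: list[dict], min_stars: int = 50, limit: int = 25) -> list[dict]:
    """Drop repositories below the star floor, then sort by stars, descending."""
    eligible = [r for r in repos if r["stargazers_count"] >= min_stars]
    eligible.sort(key=lambda r: r["stargazers_count"], reverse=True)
    return eligible[:limit]


# Sample records shaped like items in a search response. The first two use
# star counts from the list above; the third is a made-up low-star repository.
sample = [
    {"full_name": "milvus-io/milvus", "stargazers_count": 44862},
    {"full_name": "langgenius/dify", "stargazers_count": 146000},
    {"full_name": "example/tiny-rag-demo", "stargazers_count": 12},
]

ranked = rank_repositories(sample)
ranked_names = [r["full_name"] for r in ranked]
```

Filtering locally as well as in the query is redundant when the search already applies `stars:>=50`, but it keeps the ranking step correct if the records come from a cache or another source.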

Last refreshed: Sun, 21 Jun 2026 07:10:48 GMT

Need a list like this for any search?

Refolk runs natural-language searches across GitHub, LinkedIn, and the open web.
