Refolk

Top vector database repositories on GitHub

Embedding stores and approximate nearest-neighbour search engines.

Ranked by stars across 302 repositories tagged vector-database. Refreshed daily.

  1. pathwaycom/llm-app · ★ 59,826 · ⑂ 1,436

    Ready-to-run cloud templates for RAG, AI pipelines, and enterprise search with live data. 🐳 Docker-friendly. ⚡ Always in sync with SharePoint, Google Drive, S3, Kafka, PostgreSQL, real-time data APIs, and more.

    • chatbot
    • hugging-face
    • llm
    • llm-local
    • llm-prompting
    • llm-security
  2. Mintplex-Labs/anything-llm · ★ 59,643 · ⑂ 6,448

    The all-in-one AI productivity accelerator. On device and privacy first with no annoying setup or configuration.

    • rag
    • lmstudio
    • localai
    • vector-database
    • ollama
    • local-llm
  3. meilisearch/meilisearch · ★ 57,435 · ⑂ 2,538

    A lightning-fast search engine API bringing AI-powered hybrid search to your sites and applications.

    • search-engine
    • typo-tolerance
    • site-search
    • database
    • enterprise-search
    • search
  4. run-llama/llama_index · ★ 49,181 · ⑂ 7,365

    LlamaIndex is the leading document agent and OCR platform

    • agents
    • application
    • data
    • fine-tuning
    • framework
    • llamaindex
  5. milvus-io/milvus · ★ 44,154 · ⑂ 3,988

    Milvus is a high-performance, cloud-native vector database built for scalable vector ANN search

    • anns
    • nearest-neighbor-search
    • faiss
    • vector-search
    • image-search
    • hnsw
  6. qdrant/qdrant · ★ 31,097 · ⑂ 2,242

    Qdrant - High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/

    • neural-network
    • search-engine
    • knn-algorithm
    • hnsw
    • vector-search
    • nearest-neighbor-search
  7. VectifyAI/PageIndex · ★ 28,975 · ⑂ 2,462

    📑 PageIndex: Document Index for Vectorless, Reasoning-based RAG

    • agentic-ai
    • agents
    • ai
    • ai-agents
    • context-engineering
    • llm
  8. NirDiamant/RAG_Techniques · ★ 27,164 · ⑂ 3,267

    This repository showcases various advanced techniques for Retrieval-Augmented Generation (RAG) systems. Each technique has a detailed notebook tutorial.

    • rag
    • tutorials
    • langchain
    • llama-index
    • llms
    • python
  9. topoteretes/cognee · ★ 17,073 · ⑂ 1,783

    Memory control plane for AI Agents in 6 lines of code

    • ai
    • cognitive-architecture
    • vector-database
    • openai
    • rag
    • ai-agents
  10. weaviate/weaviate · ★ 16,141 · ⑂ 1,271

    Weaviate is an open-source vector database that stores both objects and vectors, allowing for the combination of vector search with structured filtering with the fault tolerance and scalability of a cloud-native database.

    • search-engine
    • semantic-search
    • semantic-search-engine
    • vector-search
    • vector-search-engine
    • vector-database
  11. memvid/memvid · ★ 15,358 · ⑂ 1,320

    Memory layer for AI Agents. Replace complex RAG pipelines with a serverless, single-file memory layer. Give your agents instant retrieval and long-term memory.

    • ai
    • context
    • embedded
    • faiss
    • knowledge-base
    • knowledge-graph
  12. neuml/txtai · ★ 12,471 · ⑂ 808

    💡 All-in-one AI framework for semantic search, LLM orchestration and language model workflows

    • python
    • search
    • nlp
    • semantic-search
    • vector-search
    • txtai
  13. langchain4j/langchain4j · ★ 11,870 · ⑂ 2,200

    LangChain4j is an idiomatic, open-source Java library for building LLM-powered applications on the JVM. It offers a unified API over popular LLM providers and vector stores, and makes implementing tool calling (including MCP support), agents and RAG easy. It integrates seamlessly with enterprise Java frameworks like Quarkus and Spring Boot.

    • huggingface
    • java
    • langchain
    • openai
    • chatgpt
    • gpt
  14. yichuan-w/LEANN · ★ 10,966 · ⑂ 962

    [MLsys2026]: RAG on Everything with LEANN. Enjoy 97% storage savings while running a fast, accurate, and 100% private RAG application on your personal device.

    • ai
    • faiss
    • langchain
    • llama-index
    • llm
    • localstorage
  15. zilliztech/claude-context · ★ 10,819 · ⑂ 798

    Code search MCP for Claude Code. Make entire codebase the context for any coding agent.

    • agent
    • agentic-rag
    • ai-coding
    • code-search
    • cursor
    • embedding
  16. oramasearch/orama · ★ 10,324 · ⑂ 387

    🌌 A complete search engine and RAG pipeline in your browser, server or edge network with support for full-text, vector, and hybrid search in less than 2kb.

    • data-structures
    • full-text
    • search
    • typo-tolerance
    • algiorithm
    • search-engine
  17. lancedb/lancedb · ★ 10,211 · ⑂ 871

    Developer-friendly OSS embedded retrieval library for multimodal AI. Search More; Manage Less.

    • approximate-nearest-neighbor-search
    • image-search
    • nearest-neighbor-search
    • recommender-system
    • search-engine
    • semantic-search
  18. oceanbase/oceanbase · ★ 10,091 · ⑂ 1,886

    The Fastest Distributed Database for Transactional, Analytical, and AI Workloads.

    • oceanbase
    • paxos
    • htap
    • cloud-native
    • mysql-compatibility
    • oltp
  19. alibaba/zvec · ★ 9,563 · ⑂ 547

    A lightweight, lightning-fast, in-process vector database

    • rag
    • agent-skills
    • embedded
    • faiss
    • hnsw
    • llm-memory
  20. databendlabs/databend · ★ 9,279 · ⑂ 869

    Data Agent Ready Warehouse : One for Analytics, Search, AI, Python Sandbox. — rebuilt from scratch. Unified architecture on your S3.

    • rust
    • database
    • serverless
    • bigdata
    • snowflake
    • ai
  21. activeloopai/deeplake · ★ 9,114 · ⑂ 709

    Deeplake is AI Data Runtime for Agents. It provides serverless postgres with a multimodal datalake, enabling scalable retrieval and training.

    • deep-learning
    • pytorch
    • ai
    • mlops
    • computer-vision
    • datalake
  22. reorproject/reor · ★ 8,557 · ⑂ 523

    Private & local AI personal knowledge management app for high entropy people.

    • ai
    • lancedb
    • llama
    • llamacpp
    • local-first
    • markdown
  23. zilliztech/deep-searcher · ★ 7,812 · ⑂ 758

    Open Source Deep Research Alternative to Reason and Search on Private Data. Written in Python.

    • agent
    • llm
    • rag
    • vector-database
    • deepseek
    • agentic-rag
  24. MariaDB/server · ★ 7,547 · ⑂ 2,039

    MariaDB server is a community developed fork of MySQL server. Started by core members of the original MySQL team, MariaDB actively works with outside developers to deliver the most featureful, stable, and sanely licensed open SQL server in the industry.

    • amazon-web-services
    • database
    • fulltext-search
    • json
    • mariadb
    • mysql
  25. vespa-engine/vespa · ★ 6,906 · ⑂ 710

    AI + Data, online. https://vespa.ai

    • vespa
    • search-engine
    • big-data
    • ai
    • serving-recommendation
    • machine-learning

Find engineers shipping vector databases

The list above ranks the most-starred public repositories tagged with the vector-database topic, drawn from the public GitHub graph. Across the 302 repositories tagged this way, the maintainers and top contributors form a tight cluster of the people actually building vector databases.

Looking for engineers who’ve worked on vector databases for real, not just listed them on LinkedIn? The fastest path is the contributor list of these repos. Their commits, issues, and READMEs are public proof of depth.

Refolk turns this list into a search. Ask for “maintainers of top vector database repos who are hiring”, “vector database engineers in San Francisco”, or “founders shipping vector databases” and Refolk returns a ranked shortlist with sources.

How this list is built

Refolk searched GitHub for public repositories tagged with the vector-database topic, ranked them by stargazer count, and kept those with at least 50 stars. The list refreshes once a day.
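
The steps above (search by topic, keep repositories with at least 50 stars, rank by stargazer count) can be sketched in a few lines of Python. This is an illustration only, not Refolk's actual pipeline: it assumes the public GitHub REST search endpoint with its `topic:` and `stars:` qualifiers, the helper names `build_search_url` and `rank_repos` are hypothetical, and the `tiny/example-repo` entry is made-up sample data. The other two sample entries reuse figures from the list above.

```python
from urllib.parse import urlencode

# Public GitHub REST endpoint for repository search.
GITHUB_SEARCH_URL = "https://api.github.com/search/repositories"


def build_search_url(topic="vector-database", min_stars=50, per_page=100, page=1):
    """Build a search URL for repos with a given topic and a minimum star count."""
    params = {
        "q": f"topic:{topic} stars:>={min_stars}",
        "sort": "stars",
        "order": "desc",
        "per_page": per_page,
        "page": page,
    }
    return f"{GITHUB_SEARCH_URL}?{urlencode(params)}"


def rank_repos(items, min_stars=50):
    """Drop repos below the star threshold, then sort by stars, highest first."""
    kept = [repo for repo in items if repo["stargazers_count"] >= min_stars]
    return sorted(kept, key=lambda repo: repo["stargazers_count"], reverse=True)


# Sample records shaped like the "items" array of a search response.
sample = [
    {"full_name": "milvus-io/milvus", "stargazers_count": 44154, "forks_count": 3988},
    {"full_name": "tiny/example-repo", "stargazers_count": 12, "forks_count": 1},  # hypothetical
    {"full_name": "pathwaycom/llm-app", "stargazers_count": 59826, "forks_count": 1436},
]

ranked = rank_repos(sample)
for position, repo in enumerate(ranked, start=1):
    print(f"{position}. {repo['full_name']} · ★ {repo['stargazers_count']:,} · ⑂ {repo['forks_count']:,}")
```

Applying the threshold locally as well as in the query keeps the ranking correct even if the records come from a cache or a different source than the search call.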

Last refreshed: Thu, 07 May 2026 05:54:12 GMT
