Top vector database repositories on GitHub

Embedding stores and approximate nearest-neighbour search engines.

Ranked by stars across 321 repositories tagged vector-database. Refreshed daily.

1
Mintplex-Labs/anything-llm ★ 61,869 · ⑂ 6,751
Stop renting your intelligence. Own it with AnythingLLM. Everything you need for a powerful local-first agent experience
- rag
- localai
- vector-database
- llm
- ai-agents
- multimodal
2
pathwaycom/llm-app ★ 59,288 · ⑂ 1,433
Ready-to-run cloud templates for RAG, AI pipelines, and enterprise search with live data. 🐳Docker-friendly.⚡Always in sync with Sharepoint, Google Drive, S3, Kafka, PostgreSQL, real-time data APIs, and more.
- chatbot
- hugging-face
- llm
- llm-local
- llm-prompting
- llm-security
3
meilisearch/meilisearch ★ 58,210 · ⑂ 2,576
A lightning-fast search engine API bringing AI-powered hybrid search to your sites and applications.
- search-engine
- typo-tolerance
- site-search
- database
- enterprise-search
- search
4
run-llama/llama_index ★ 50,246 · ⑂ 7,597
LlamaIndex is the leading document agent and OCR platform
- agents
- application
- data
- fine-tuning
- framework
- llamaindex
5
milvus-io/milvus ★ 44,862 · ⑂ 4,077
Milvus is a high-performance, cloud-native vector database built for scalable vector ANN search
- anns
- nearest-neighbor-search
- faiss
- vector-search
- image-search
- hnsw
6
VectifyAI/PageIndex ★ 33,257 · ⑂ 2,894
📑 PageIndex: Document Index for Vectorless, Reasoning-based RAG
- agentic-ai
- agents
- ai
- ai-agents
- context-engineering
- llm
7
qdrant/qdrant ★ 32,502 · ⑂ 2,407
Qdrant - High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/
- neural-network
- search-engine
- knn-algorithm
- hnsw
- vector-search
- nearest-neighbor-search
8
NirDiamant/RAG_Techniques ★ 28,079 · ⑂ 3,403
This repository showcases various advanced techniques for Retrieval-Augmented Generation (RAG) systems. Each technique has a detailed notebook tutorial.
- rag
- tutorials
- langchain
- llama-index
- llms
- python
9
topoteretes/cognee ★ 18,338 · ⑂ 1,951
Cognee is the open-source AI memory platform for agents. Give your AI agents persistent long-term memory across sessions with a self-hosted knowledge graph engine.
- ai
- cognitive-architecture
- vector-database
- ai-agents
- graph-database
- ai-memory
10
weaviate/weaviate ★ 16,384 · ⑂ 1,314
Weaviate is an open-source vector database that stores both objects and vectors, allowing for the combination of vector search with structured filtering with the fault tolerance and scalability of a cloud-native database.
- search-engine
- semantic-search
- semantic-search-engine
- vector-search
- vector-search-engine
- vector-database
11
memvid/memvid ★ 15,675 · ⑂ 1,355
Memory layer for AI Agents. Replace complex RAG pipelines with a serverless, single-file memory layer. Give your agents instant retrieval and long-term memory.
- ai
- context
- embedded
- faiss
- knowledge-base
- knowledge-graph
12
neuml/txtai ★ 12,673 · ⑂ 835
💡 All-in-one AI framework for semantic search, LLM orchestration and language model workflows
- python
- search
- nlp
- semantic-search
- vector-search
- txtai
13
StarTrail-org/LEANN ★ 12,456 · ⑂ 1,119
[MLsys2026]: RAG on Everything with LEANN. Enjoy 97% storage savings while running a fast, accurate, and 100% private RAG application on your personal device.
- ai
- faiss
- langchain
- llama-index
- llm
- localstorage
14
langchain4j/langchain4j ★ 12,377 · ⑂ 2,319
LangChain4j is an idiomatic, open-source Java library for building LLM-powered applications on the JVM. It offers a unified API over popular LLM providers and vector stores, and makes implementing tool calling (including MCP support), agents and RAG easy. It integrates seamlessly with enterprise Java frameworks like Quarkus and Spring Boot.
- huggingface
- java
- langchain
- openai
- chatgpt
- gpt
15
zilliztech/claude-context ★ 11,908 · ⑂ 881
Code search MCP for Claude Code. Make entire codebase the context for any coding agent.
- agent
- agentic-rag
- ai-coding
- code-search
- cursor
- embedding
16
alibaba/zvec ★ 11,896 · ⑂ 703
A lightweight, lightning-fast, in-process vector database
- rag
- agent-skills
- embedded
- faiss
- hnsw
- llm-memory
17
lancedb/lancedb ★ 10,666 · ⑂ 919
Developer-friendly OSS embedded retrieval library for multimodal AI. Search More; Manage Less.
- approximate-nearest-neighbor-search
- image-search
- nearest-neighbor-search
- recommender-system
- search-engine
- semantic-search
18
oramasearch/orama ★ 10,422 · ⑂ 393
🌌 A complete search engine and RAG pipeline in your browser, server or edge network with support for full-text, vector, and hybrid search in less than 2kb.
- data-structures
- full-text
- search
- typo-tolerance
- algiorithm
- search-engine
19
oceanbase/oceanbase ★ 10,161 · ⑂ 1,903
The Fastest Distributed Database for Transactional, Analytical, and AI Workloads.
- oceanbase
- paxos
- htap
- cloud-native
- mysql-compatibility
- oltp
20
databendlabs/databend ★ 9,346 · ⑂ 886
Data Agent Ready Warehouse : One for Analytics, Search, AI, Python Sandbox. — rebuilt from scratch. Unified architecture on your S3.
- rust
- database
- serverless
- bigdata
- snowflake
- ai
21
activeloopai/deeplake ★ 9,183 · ⑂ 717
Deeplake is AI Data Runtime for Agents. It provides serverless postgres with a multimodal datalake, enabling scalable retrieval and training.
- deep-learning
- pytorch
- ai
- mlops
- computer-vision
- datalake
22
reorproject/reor ★ 8,561 · ⑂ 527
Private & local AI personal knowledge management app for high entropy people.
- ai
- lancedb
- llama
- llamacpp
- local-first
- markdown
23
zilliztech/deep-searcher ★ 7,898 · ⑂ 763
Open Source Deep Research Alternative to Reason and Search on Private Data. Written in Python.
- agent
- llm
- rag
- vector-database
- deepseek
- agentic-rag
24
MariaDB/server ★ 7,758 · ⑂ 2,066
MariaDB server is a community developed fork of MySQL server. Started by core members of the original MySQL team, MariaDB actively works with outside developers to deliver the most featureful, stable, and sanely licensed open SQL server in the industry.
- amazon-web-services
- database
- fulltext-search
- json
- mariadb
- mysql
25
vespa-engine/vespa ★ 6,967 · ⑂ 720
The AI search platform
- vespa
- search-engine
- big-data
- ai
- serving-recommendation
- machine-learning

Find engineers shipping vector databases

The list above ranks the most-starred public repositories tagged with the vector-database topic, drawn from the public GitHub graph. Across the 321 repositories tagged this way, the maintainers and top contributors form a tight cluster of the people actually building vector databases.

Looking for engineers who’ve worked on vector databases for real, not just listed them on LinkedIn? The fastest path is the contributor lists of these repos. Their commits, issues, and READMEs are public proof of depth.

Refolk turns this list into a search. Ask for “maintainers of top vector database repos who are hiring”, “vector database engineers in San Francisco”, or “founders shipping vector databases” and Refolk returns a ranked shortlist with sources.

How this list is built

Refolk searched GitHub for public repositories tagged with the vector-database topic, kept those with at least 50 stars, and ranked them by stargazer count. The list refreshes once a day.
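
The steps described here (search by topic, keep repositories with at least 50 stars, rank by stargazer count) can be approximated against GitHub's public repository search endpoint. The following is a minimal sketch, not Refolk's actual pipeline, which this page does not describe. The function names `build_search_url`, `rank_repos`, and `fetch_top_repos` are hypothetical, and the sketch assumes an unauthenticated request, which is subject to GitHub's rate limits.

```python
import json
import urllib.parse
import urllib.request

# GitHub's public REST endpoint for repository search.
SEARCH_URL = "https://api.github.com/search/repositories"


def build_search_url(topic: str, min_stars: int = 50, per_page: int = 25) -> str:
    """Build a search URL for one topic, sorted by stars (hypothetical helper)."""
    # `topic:` and `stars:>=` are GitHub search qualifiers.
    query = f"topic:{topic} stars:>={min_stars}"
    params = {"q": query, "sort": "stars", "order": "desc", "per_page": per_page}
    return f"{SEARCH_URL}?{urllib.parse.urlencode(params)}"


def rank_repos(repos: list[dict], min_stars: int = 50) -> list[dict]:
    """Drop repositories below the star threshold, then sort by stars, descending."""
    kept = [repo for repo in repos if repo["stargazers_count"] >= min_stars]
    return sorted(kept, key=lambda repo: repo["stargazers_count"], reverse=True)


def fetch_top_repos(topic: str, min_stars: int = 50, per_page: int = 25) -> list[dict]:
    """Fetch one page of results and rank it locally (needs network access)."""
    request = urllib.request.Request(
        build_search_url(topic, min_stars, per_page),
        headers={"Accept": "application/vnd.github+json"},
    )
    with urllib.request.urlopen(request) as response:
        items = json.load(response)["items"]
    # The API already sorts by stars; ranking again keeps the logic explicit.
    return rank_repos(items, min_stars)
```

Calling `fetch_top_repos("vector-database")` would return up to 25 repository records, each carrying `full_name`, `stargazers_count`, and `forks_count` fields. Star counts change constantly, so the results will not match the snapshot above exactly.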

Last refreshed: Sun, 21 Jun 2026 07:07:38 GMT

