Top vector database repositories on GitHub
Embedding stores and approximate nearest-neighbour search engines.
Ranked by stars across 302 repositories tagged vector-database. Refreshed daily.
- 1. pathwaycom/llm-app ★ 59,826 · ⑂ 1,436
Ready-to-run cloud templates for RAG, AI pipelines, and enterprise search with live data. 🐳 Docker-friendly. ⚡ Always in sync with Sharepoint, Google Drive, S3, Kafka, PostgreSQL, real-time data APIs, and more.
- chatbot
- hugging-face
- llm
- llm-local
- llm-prompting
- llm-security
- 2. Mintplex-Labs/anything-llm ★ 59,643 · ⑂ 6,448
The all-in-one AI productivity accelerator. On device and privacy first with no annoying setup or configuration.
- rag
- lmstudio
- localai
- vector-database
- ollama
- local-llm
- 3. meilisearch/meilisearch ★ 57,435 · ⑂ 2,538
A lightning-fast search engine API bringing AI-powered hybrid search to your sites and applications.
- search-engine
- typo-tolerance
- site-search
- database
- enterprise-search
- search
- 4. run-llama/llama_index ★ 49,181 · ⑂ 7,365
LlamaIndex is the leading document agent and OCR platform
- agents
- application
- data
- fine-tuning
- framework
- llamaindex
- 5. milvus-io/milvus ★ 44,154 · ⑂ 3,988
Milvus is a high-performance, cloud-native vector database built for scalable vector ANN search
- anns
- nearest-neighbor-search
- faiss
- vector-search
- image-search
- hnsw
- 6. qdrant/qdrant ★ 31,097 · ⑂ 2,242
Qdrant - High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/
- neural-network
- search-engine
- knn-algorithm
- hnsw
- vector-search
- nearest-neighbor-search
- 7. VectifyAI/PageIndex ★ 28,975 · ⑂ 2,462
📑 PageIndex: Document Index for Vectorless, Reasoning-based RAG
- agentic-ai
- agents
- ai
- ai-agents
- context-engineering
- llm
- 8. NirDiamant/RAG_Techniques ★ 27,164 · ⑂ 3,267
This repository showcases various advanced techniques for Retrieval-Augmented Generation (RAG) systems. Each technique has a detailed notebook tutorial.
- rag
- tutorials
- langchain
- llama-index
- llms
- python
- 9. topoteretes/cognee ★ 17,073 · ⑂ 1,783
Memory control plane for AI Agents in 6 lines of code
- ai
- cognitive-architecture
- vector-database
- openai
- rag
- ai-agents
- 10. weaviate/weaviate ★ 16,141 · ⑂ 1,271
Weaviate is an open-source vector database that stores both objects and vectors, allowing for the combination of vector search with structured filtering with the fault tolerance and scalability of a cloud-native database.
- search-engine
- semantic-search
- semantic-search-engine
- vector-search
- vector-search-engine
- vector-database
- 11. memvid/memvid ★ 15,358 · ⑂ 1,320
Memory layer for AI Agents. Replace complex RAG pipelines with a serverless, single-file memory layer. Give your agents instant retrieval and long-term memory.
- ai
- context
- embedded
- faiss
- knowledge-base
- knowledge-graph
- 12. neuml/txtai ★ 12,471 · ⑂ 808
💡 All-in-one AI framework for semantic search, LLM orchestration and language model workflows
- python
- search
- nlp
- semantic-search
- vector-search
- txtai
- 13. langchain4j/langchain4j ★ 11,870 · ⑂ 2,200
LangChain4j is an idiomatic, open-source Java library for building LLM-powered applications on the JVM. It offers a unified API over popular LLM providers and vector stores, and makes implementing tool calling (including MCP support), agents and RAG easy. It integrates seamlessly with enterprise Java frameworks like Quarkus and Spring Boot.
- huggingface
- java
- langchain
- openai
- chatgpt
- gpt
- 14. yichuan-w/LEANN ★ 10,966 · ⑂ 962
[MLsys2026]: RAG on Everything with LEANN. Enjoy 97% storage savings while running a fast, accurate, and 100% private RAG application on your personal device.
- ai
- faiss
- langchain
- llama-index
- llm
- localstorage
- 15. zilliztech/claude-context ★ 10,819 · ⑂ 798
Code search MCP for Claude Code. Make entire codebase the context for any coding agent.
- agent
- agentic-rag
- ai-coding
- code-search
- cursor
- embedding
- 16. oramasearch/orama ★ 10,324 · ⑂ 387
🌌 A complete search engine and RAG pipeline in your browser, server or edge network with support for full-text, vector, and hybrid search in less than 2kb.
- data-structures
- full-text
- search
- typo-tolerance
- algiorithm
- search-engine
- 17. lancedb/lancedb ★ 10,211 · ⑂ 871
Developer-friendly OSS embedded retrieval library for multimodal AI. Search More; Manage Less.
- approximate-nearest-neighbor-search
- image-search
- nearest-neighbor-search
- recommender-system
- search-engine
- semantic-search
- 18. oceanbase/oceanbase ★ 10,091 · ⑂ 1,886
The Fastest Distributed Database for Transactional, Analytical, and AI Workloads.
- oceanbase
- paxos
- htap
- cloud-native
- mysql-compatibility
- oltp
- 19. alibaba/zvec ★ 9,563 · ⑂ 547
A lightweight, lightning-fast, in-process vector database
- rag
- agent-skills
- embedded
- faiss
- hnsw
- llm-memory
- 20. databendlabs/databend ★ 9,279 · ⑂ 869
Data Agent Ready Warehouse: One for Analytics, Search, AI, Python Sandbox. Rebuilt from scratch. Unified architecture on your S3.
- rust
- database
- serverless
- bigdata
- snowflake
- ai
- 21. activeloopai/deeplake ★ 9,114 · ⑂ 709
Deeplake is an AI Data Runtime for Agents. It provides serverless Postgres with a multimodal datalake, enabling scalable retrieval and training.
- deep-learning
- pytorch
- ai
- mlops
- computer-vision
- datalake
- 22. reorproject/reor ★ 8,557 · ⑂ 523
Private & local AI personal knowledge management app for high entropy people.
- ai
- lancedb
- llama
- llamacpp
- local-first
- markdown
- 23. zilliztech/deep-searcher ★ 7,812 · ⑂ 758
Open Source Deep Research Alternative to Reason and Search on Private Data. Written in Python.
- agent
- llm
- rag
- vector-database
- deepseek
- agentic-rag
- 24. MariaDB/server ★ 7,547 · ⑂ 2,039
MariaDB server is a community developed fork of MySQL server. Started by core members of the original MySQL team, MariaDB actively works with outside developers to deliver the most featureful, stable, and sanely licensed open SQL server in the industry.
- amazon-web-services
- database
- fulltext-search
- json
- mariadb
- mysql
- 25. vespa-engine/vespa ★ 6,906 · ⑂ 710
AI + Data, online. https://vespa.ai
- vespa
- search-engine
- big-data
- ai
- serving-recommendation
- machine-learning
Find engineers shipping vector databases
The list above ranks the most-starred public repositories tagged with the vector-database topic, drawn from the public GitHub graph. Across the 302 repositories tagged this way, the maintainers and top contributors form a tight cluster of the people actually building vector databases.
Looking for engineers who have worked on vector databases for real, not just listed them on LinkedIn? The fastest path is the contributor list of these repos. Their commits, issues, and READMEs are public proof of depth.
Refolk turns this list into a search. Ask for “maintainers of top vector database repos who are hiring”, “vector database engineers in San Francisco”, or “founders shipping vector databases” and Refolk returns a ranked shortlist with sources.
How this list is built
Last refreshed: Thu, 07 May 2026 05:54:12 GMT
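The ranking rule described above (most-starred first, cut to the top 25) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Repo` type and the `rank_by_stars` helper are hypothetical names, not part of any published pipeline, and the sample rows are copied from the list above rather than fetched from GitHub.

```python
from typing import NamedTuple


class Repo(NamedTuple):
    """Hypothetical record for one repository row in the list."""
    name: str
    stars: int
    forks: int


def rank_by_stars(repos: list[Repo], limit: int = 25) -> list[Repo]:
    """Return the `limit` most-starred repos, highest star count first."""
    return sorted(repos, key=lambda r: r.stars, reverse=True)[:limit]


# Sample rows copied from the list above, deliberately out of order.
sample = [
    Repo("qdrant/qdrant", 31_097, 2_242),
    Repo("pathwaycom/llm-app", 59_826, 1_436),
    Repo("milvus-io/milvus", 44_154, 3_988),
    Repo("weaviate/weaviate", 16_141, 1_271),
]

for position, repo in enumerate(rank_by_stars(sample, limit=3), start=1):
    print(f"{position}. {repo.name}: {repo.stars:,} stars, {repo.forks:,} forks")
```

Forks are carried along for display only; in this sketch they play no part in the ordering, which matches the "ranked by stars" description.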