Article Issue #5186

Vector Database

What to know

Vector Database is a specialized database designed to persist embedding vectors and execute approximate nearest-neighbor (ANN) queries efficiently, enabling applications to find the most semantically similar items to a query vector in milliseconds across millions or billions of stored embeddings; Vector databases build an ANN index, commonly HNSW (Hierarchical Navigable Small World graphs) or IVF (Inverted File Index), over stored vectors; Choosing between a dedicated vector database and a general-purpose database with vector extension depends on scale and operational complexity

Wikiwalls Team Administrator

May 15, 2026 2 min read

« Back to Glossary Index

Vector Database is a specialized database designed to persist embedding vectors and execute approximate nearest-neighbor (ANN) queries efficiently, enabling applications to find the most semantically similar items to a query vector in milliseconds across millions or billions of stored embeddings. It is the persistence layer powering RAG, semantic search, and recommendation systems.

How it works

Vector databases build an ANN index, commonly HNSW (Hierarchical Navigable Small World graphs) or IVF (Inverted File Index), over stored vectors. At query time, the index allows the database to traverse a small subset of the search space rather than computing distances to every stored vector. Most vector databases also store metadata alongside vectors and support hybrid filtering by both semantic similarity and structured attributes.

Key facts

Dedicated databases: Pinecone, Weaviate, Qdrant, Milvus, and Chroma are purpose-built vector databases.
Extensions: pgvector for PostgreSQL and Elasticsearch’s kNN search add vector capabilities to existing databases.
Index types: HNSW provides high recall and low latency but uses significant memory; IVF is more memory-efficient but requires a training step.
Hybrid search: Most vector databases support combining ANN with metadata filters or BM25 keyword scoring.

For builders

Choosing between a dedicated vector database and a general-purpose database with vector extension depends on scale and operational complexity. For prototypes or smaller datasets under a few million vectors, pgvector in an existing PostgreSQL instance is often sufficient. At hundreds of millions of vectors or with strict latency SLAs, a purpose-built solution with tuned indexing and distributed sharding becomes necessary.

Sources

« Back to Definition Index

If this saved you an afternoon — and we will send the next one straight to your inbox.

Wikiwalls Team

Administrator · 41 published guides · Joined 2016

Welcome to wikiwalls

How it works

Key facts

For builders

Sources

More from WikiWalls

Cursor vs Copilot vs Cody vs Windsurf, after a 30-day production diary

The Cheapest Production-Grade LLM, ranked at constant output quality

Best Mini-PC for Homelab: Beelink, Minisforum, GMKtec Tested

Best AI Note Apps: Mem vs Reflect vs Tana vs Saner.ai

One careful fix in your inbox each Wednesday.