knowledge-base/VECTOR-DBS.en.md
Stanislav Hubacek ef3c2f75b1 18.6.2026
2026-06-18 16:25:33 +02:00

🧠 Vector Databases

Overview

Specialized databases for storing and searching embeddings — vector representations of unstructured data (text, images, audio, video). They enable semantic search based on similarity, not exact matching. A key building block for RAG (Retrieval-Augmented Generation) and AI applications.

Embeddings

  • Map each unstructured item to a point in a vector space (a fixed-length list of numbers)
  • Proximity in the vector space (for example, cosine similarity or Euclidean distance) approximates semantic similarity
  • Generated by models: Word2Vec, BERT, OpenAI embeddings, E5, Cohere, Mistral
  • Dimensions: from 384 (all-MiniLM-L6-v2) to 3072 (OpenAI text-embedding-3-large)
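
As a minimal sketch of "proximity = similarity", the snippet below computes cosine similarity between toy vectors. The three-dimensional vectors and their names are made up for illustration; real embedding models produce hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Hypothetical 3-dimensional "embeddings", not the output of a real model.
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
invoice = [0.0, 0.1, 0.9]

# Related concepts should score higher than unrelated ones.
print(cosine_similarity(cat, kitten) > cosine_similarity(cat, invoice))
```

Cosine similarity ignores vector length and compares direction only, which is why it is a common default metric for text embeddings.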

Vector indexing

| Method | Algorithm | Description | Accuracy (recall) | Search cost |
| --- | --- | --- | --- | --- |
| Flat (brute-force) | Full scan | Compares the query with every stored vector | 100% (exact) | O(N); slow beyond ~100K vectors |
| IVF (Inverted File) | K-means clustering | Partitions vectors into clusters and searches only the clusters nearest to the query | ~95-99% | O(sqrt(N)) |
| HNSW (Hierarchical Navigable Small World) | Navigable graph | Multi-level graph, greedy search | ~99-100% | O(log N) |
| IVF-PQ | IVF + Product Quantization | Compresses vectors to reduce memory use | ~90-95% | O(sqrt(N)) |
| DiskANN | SSD-based graph | Vectors on disk, Vamana graph | ~95-98% | O(log N) + I/O |
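
To make the Flat and IVF rows concrete, here is a small pure-Python sketch of both. It is illustrative only: the centroids are hand-picked (a real IVF index would learn them with k-means), and the names `build_ivf`, `ivf_search`, and `nprobe` are assumptions of this example rather than the API of any particular database.

```python
import heapq

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def flat_search(vectors, query, k):
    """Exact top-k: compare the query with every stored vector, O(N)."""
    return heapq.nsmallest(k, range(len(vectors)),
                           key=lambda i: sq_dist(vectors[i], query))

def build_ivf(vectors, centroids):
    """Put each vector id into the inverted list of its nearest centroid."""
    lists = [[] for _ in centroids]
    for i, v in enumerate(vectors):
        nearest = min(range(len(centroids)),
                      key=lambda c: sq_dist(centroids[c], v))
        lists[nearest].append(i)
    return lists

def ivf_search(vectors, centroids, lists, query, k, nprobe=1):
    """Approximate top-k: scan only the nprobe clusters nearest to the query."""
    probe = heapq.nsmallest(nprobe, range(len(centroids)),
                            key=lambda c: sq_dist(centroids[c], query))
    candidates = [i for c in probe for i in lists[c]]
    return heapq.nsmallest(k, candidates,
                           key=lambda i: sq_dist(vectors[i], query))

vectors = [[0.0, 0.0], [0.1, 0.2], [0.9, 1.0], [1.0, 0.9], [0.45, 0.5]]
centroids = [[0.0, 0.0], [1.0, 1.0]]  # hand-picked for the example
lists = build_ivf(vectors, centroids)
query = [0.6, 0.7]

print(flat_search(vectors, query, 2))                             # exact result
print(ivf_search(vectors, centroids, lists, query, 2, nprobe=1))  # may miss a neighbor
print(ivf_search(vectors, centroids, lists, query, 2, nprobe=2))  # probes every cluster
```

With `nprobe=1` the search skips the cluster that holds the true nearest neighbor, which is the accuracy loss the table attributes to IVF. Raising `nprobe` recovers recall at the cost of scanning more vectors.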

Index selection

| Number of vectors | Requirement | Recommended index |
| --- | --- | --- |
| < 100K | 100% accuracy | Flat |
| 100K - 10M | High accuracy, speed | HNSW |
| 10M+ | Memory efficiency | IVF-PQ, DiskANN |
| 100M+ | Scaling on SSD | DiskANN |
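
A back-of-the-envelope calculation helps explain why memory becomes the deciding factor at 10M+ vectors. The sketch below assumes float32 vectors, a hypothetical 1536-dimensional model, and 8-bit product-quantization codes with 96 subvectors. It ignores codebooks, graph links, and metadata, so treat the numbers as rough lower bounds. The `recommend_index` helper simply encodes the thresholds from the table above.

```python
def raw_vector_bytes(num_vectors, dim, bytes_per_value=4):
    """Memory for uncompressed vectors (float32 by default)."""
    return num_vectors * dim * bytes_per_value

def pq_code_bytes(num_vectors, num_subvectors, bits_per_code=8):
    """Memory for product-quantized codes: one short code per subvector."""
    return num_vectors * num_subvectors * bits_per_code // 8

def recommend_index(num_vectors):
    """Thresholds taken from the index selection table."""
    if num_vectors < 100_000:
        return "Flat"
    if num_vectors < 10_000_000:
        return "HNSW"
    if num_vectors < 100_000_000:
        return "IVF-PQ or DiskANN"
    return "DiskANN"

n, dim = 10_000_000, 1536  # assumed workload
raw = raw_vector_bytes(n, dim)
compressed = pq_code_bytes(n, num_subvectors=96)

print(f"raw float32: {raw / 1e9:.2f} GB")
print(f"PQ codes:    {compressed / 1e9:.2f} GB")
print(f"recommended: {recommend_index(n)}")
```

Under these assumptions the raw vectors need about 61 GB, while the PQ codes fit in under 1 GB, which is the trade the table describes: less memory in exchange for lower recall.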

Use case: RAG (Retrieval-Augmented Generation)

User query → Embedding model → Vector DB search → Relevant chunks → LLM → Answer
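
The flow above can be sketched end to end in a few lines. Everything here is a stand-in: `embed` is a toy bag-of-words counter rather than a real embedding model, the "vector DB" is a Python list, and `llm` is any callable you pass in. The function names are hypothetical and chosen only for this example.

```python
import math
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve(query, chunks, k=2):
    """Vector search step: rank chunks by similarity to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

def build_prompt(query, context_chunks):
    """Combine the retrieved chunks and the question into one prompt."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

def answer(query, chunks, llm, k=2):
    """Naive RAG: one retrieval followed by one generation."""
    return llm(build_prompt(query, retrieve(query, chunks, k)))

chunks = [
    "HNSW builds a multi-level graph for fast search",
    "PostgreSQL supports ACID transactions",
    "IVF partitions vectors into clusters with k-means",
]
query = "how does HNSW search the graph"

print(retrieve(query, chunks, k=1))
print(answer(query, chunks, llm=lambda prompt: "(model output goes here)", k=1))
```

In a real pipeline `embed` would call an embedding model, `retrieve` would query the vector database, and `llm` would call a language model. The control flow stays the same.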

Variants:

  • Naive RAG — single retrieval + single generation
  • Advanced RAG — pre-retrieval (query rewriting, HyDE) + post-retrieval (reranking, filtering)
  • Multi-modal RAG — text + images + audio in one pipeline
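
The post-retrieval step of Advanced RAG can be sketched as "rescore, filter, reorder". In the example below, `overlap_scorer` is a deliberately simple stand-in for a real reranking model (such as a cross-encoder), and the `min_score` threshold is an arbitrary value chosen for illustration.

```python
def overlap_scorer(query, chunk):
    """Toy relevance score: fraction of query terms present in the chunk."""
    q = set(query.lower().split())
    c = set(chunk.lower().split())
    return len(q & c) / len(q) if q else 0.0

def rerank(query, candidates, scorer, min_score=0.0, top_k=3):
    """Rescore first-stage candidates, drop weak ones, return the best top_k."""
    scored = [(scorer(query, c), c) for c in candidates]
    kept = [(s, c) for s, c in scored if s >= min_score]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in kept[:top_k]]

# Candidates in the order a first-stage vector search might return them.
candidates = [
    "embeddings map text into vector space",
    "postgresql stores rows",
    "vector databases store embeddings",
]
query = "store vector embeddings"

print(rerank(query, candidates, overlap_scorer, min_score=0.5))
```

The irrelevant chunk is filtered out and the best match moves to the front, so the LLM receives a shorter and more relevant context.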

Tools — comparison

| Tool | Type | Indexes | Cloud | Self-hosted | Note |
| --- | --- | --- | --- | --- | --- |
| Pinecone | Managed | Proprietary (not user-selectable) | Yes | No | Fully managed, no ops. Pricing by dimension and vector count |
| Weaviate | Open source | HNSW, Flat | Yes (WCD) | Yes | Graph + vector, hybrid queries, modular (generative search) |
| Qdrant | Open source | HNSW (plus scalar, product, and binary quantization) | Yes (Cloud) | Yes | Written in Rust, batch API, filtering applied during vector search |
| Milvus | Open source | IVF, HNSW, IVF-PQ, DiskANN | Yes (Zilliz) | Yes | GPU acceleration. More complex ops (Kubernetes recommended for cluster deployments) |
| pgvector | PostgreSQL extension | IVFFlat, HNSW | Yes (managed PostgreSQL, e.g. Amazon RDS) | Yes | Embeddings directly in PostgreSQL. Hybrid SQL + vectors |
| Chroma | Open source | HNSW | No | Yes | Simple embedding + retrieval, Python-native |
| LanceDB | Open source | IVF-PQ | No | Yes | Multi-modal data, Arrow format, no server (embedded) |
| Elasticsearch | Search engine | HNSW (8.0+) | Yes (Cloud) | Yes | If you already run ES, it can serve vectors too |

pgvector vs standalone vector DB

| Feature | pgvector | Standalone (Pinecone, Qdrant, Milvus) |
| --- | --- | --- |
| Architecture | Extension in PostgreSQL | Standalone service |
| Hybrid queries | Native SQL + vectors | Joins with relational data require coordinating two systems |
| Latency | Typically higher (disk-based PG storage) | Typically lower (in-memory indexes) |
| Scaling | PG replication / Citus | Native sharding, rebalancing |
| Consistency | PG ACID transactions | Often eventual consistency (varies by product) |
| Operations | One system | Two systems (operational overhead) |

Recommendations — Tool selection

| Scenario | Recommendation | Rationale |
| --- | --- | --- |
| RAG on PostgreSQL data | pgvector | Hybrid SQL + vectors in one DB |
| RAG production, no ops | Pinecone | Fully managed, scalable, no operations |
| Self-hosted RAG | Qdrant (simpler) / Milvus (performance) | Open source, data control |
| Full-text + vectors | Elasticsearch / Weaviate | Combination of BM25 + vector score |
| Research / prototyping | Chroma | Python-native, quick start |
| Embedded / edge | LanceDB | No server, Arrow format |
| Multi-modal data | Weaviate / LanceDB | Native image, audio, video support |
| GPU acceleration | Milvus | CUDA support for index build |
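
The "Full-text + vectors" row relies on merging a BM25 ranking with a vector ranking. One widely used way to do that is reciprocal rank fusion (RRF), sketched below. It is shown as an illustrative technique, not as the exact scoring that Elasticsearch or Weaviate applies; the document IDs and the constant `k=60` are assumptions of the example.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists; each hit contributes 1 / (k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)

# Hypothetical results from a keyword search and a vector search.
bm25_ranking = ["doc_a", "doc_b", "doc_c"]
vector_ranking = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([bm25_ranking, vector_ranking])
print(fused)
```

RRF uses only rank positions, so it avoids having to normalize BM25 scores and similarity scores onto a common scale. Documents found by both searches rise to the top.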

When to (not) use a vector DB

Use when:

  • You need semantic search (similarity by meaning, not keywords)
  • You are building a RAG / AI assistant over your own data
  • You need to deduplicate documents or images (near-duplicate detection)
  • You are building a recommendation system (similar content, similar users)

Do not use when:

  • You need exact matching (keys, IDs, foreign keys) → SQL
  • Full-text search suffices (BM25, stemming) → Elasticsearch, PostgreSQL full-text
  • Vectors are just a complement to the primary DB → pgvector instead of a standalone vector DB (simplicity)
  • Fewer than 1000 documents → brute-force in application is sufficient

Sources

References, books, and standards: sources/databases/sources.en.md

| Book | Authors | Description |
| --- | --- | --- |
| Vector Databases | Borwankar (2026) | Comprehensive guide to vector DBs from concepts to production deployment |

Last revision: 2026-06-03