Back to roadmaps pgvector Course

Vector Indexes: Comparing IVFFlat and HNSW

Without indexes, PostgreSQL must perform a sequential scan on your table, calculating vector distance for every single row. To speed up similarity queries on large datasets, you must create vector indexes. pgvector supports two indexing algorithms: IVFFlat and HNSW.


1. IVFFlat (Inverted File Flat)

IVFFlat partitions your vector database table into clusters using k-means clustering. When you execute a search query, it checks only the most relevant clusters:

Key Characteristics

  • Build Speed: Fast to generate.
  • Memory Footprint: Low memory consumption.
  • Accuracy vs Speed: High speed improvements but can occasionally miss nearest neighbors if the search scan parameters are set too low.
  • Requirement: Requires the table to be pre-populated with data to build accurate clusters.

2. HNSW (Hierarchical Navigable Small World)

HNSW builds a multi-layered graph index where vectors are connected to their nearest neighbors. Searching this index behaves like traversing a map, quickly jumping across regions to find the closest matches:

Key Characteristics

  • Query Performance: Fastest query speeds, outperforming IVFFlat by a significant margin.
  • Accuracy: Excellent nearest-neighbor search accuracy.
  • Memory Footprint: High memory usage because the graph structure is kept in RAM.
  • Build Speed: Extremely slow index build times compared to IVFFlat.
  • Requirement: Can be created on empty tables; updates dynamically as new rows are inserted.

3. Comparison Summary

Metric IVFFlat HNSW
Query Speed Fast Extremely Fast
Memory Cost Low High
Build Time Short Long
Accuracy Good Excellent
Recommendation Large datasets with limited RAM Production RAG systems requiring fast response times
Published on Last updated: