Quantizer

FLAT

No compression — each vector stored as the original FLOAT[d]. Baseline for recall; use when memory isn't the bottleneck.

When to use it

Flat stores the raw float32 vector inside the index — no compression, no rerank needed. It is the correctness baseline: recall equals brute-force up to graph-search pruning. Choose it when the dataset fits in memory at float32 and absolute recall matters more than memory footprint.

Example

CREATE INDEX docs_idx ON docs USING HNSW (embedding)
    WITH (metric='cosine', quantizer='flat');