RaBitQ
Randomized rotation + per-dimension bit packing with a provable distance-error bound. 3-bit codes reach ≥99% Recall@10 on SIFT1M when paired with a 20× rerank pass (see the table below).
RaBitQ: Quantizing High-Dimensional Vectors with a Theoretical Error Bound for Approximate Nearest Neighbor Search
Gao, J.; Long, C.
SIGMOD 2024
Proposes a randomized 1-bit-per-dimension quantizer whose distance estimator is unbiased, comes with a provable error bound, and is SIMD-friendly to compute.
Recall vs bits
Low-bit RaBitQ is a coarse filter: on its own, the estimated distances are noisy.
Pair it with a rerank pass (rerank=N): pull the top k×N candidates by estimated distance, re-score them against the authoritative FLOAT[d] column, and keep the best k. The operator plan is the same regardless of N: TOP_N ← PROJECTION ← VINDEX_INDEX_SCAN.
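The two-stage idea can be illustrated with a minimal Python sketch. This is not the engine's implementation: `search_with_rerank` and its parameters are hypothetical names, and `approx_dist` simply stands in for whatever noisy distances the quantized index scan would produce.

```python
import heapq
import math


def search_with_rerank(query, vectors, approx_dist, k, rerank):
    """Return the ids of the k nearest vectors, nearest first.

    query       -- full-precision query vector (sequence of floats)
    vectors     -- full-precision stored vectors (the authoritative column)
    approx_dist -- approx_dist[i] is a cheap, noisy estimate of the distance
                   from query to vectors[i] (stand-in for the quantized scan)
    k           -- number of results wanted
    rerank      -- oversampling factor N: k * N candidates are re-scored
    """
    n_candidates = min(k * rerank, len(vectors))

    # Stage 1: coarse filter. Keep the k * N ids with the smallest estimates.
    candidates = heapq.nsmallest(
        n_candidates, range(len(vectors)), key=lambda i: approx_dist[i]
    )

    # Stage 2: rerank. Re-score only those candidates with exact distances.
    return heapq.nsmallest(
        k, candidates, key=lambda i: math.dist(query, vectors[i])
    )
```

A larger `rerank` gives the exact pass more chances to recover true neighbours that the noisy estimate ranked too low, at the cost of more full-precision reads per query.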
| bits | No rerank | +10× rerank | +20× rerank | Bytes / vec (d=128) |
|---|---|---|---|---|
| 1 | ~0.40 | ~0.85 | ≥0.90 | 28 B |
| 2 | ~0.60 | ~0.95 | ≥0.97 | 44 B |
| 3 (default) | ~0.80 | ≥0.98 | ≥0.99 | 60 B |
| 4 | ~0.90 | ≥0.99 | ≥0.99 | 76 B |
| 8 | ~0.99 | ≥0.99 | ≥0.99 | 140 B |
| float32 | 1.00 | 1.00 | 1.00 | 512 B |
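The byte counts in the last column are consistent with d × bits / 8 bytes of packed code plus a fixed 12 bytes per vector. The 12-byte figure is inferred from the table (28 B minus 16 B of code at 1 bit), not taken from any documented layout, so treat the sketch below as a sizing aid under that assumption. The function names are hypothetical.

```python
def rabitq_bytes_per_vector(d, bits, overhead_bytes=12):
    """Estimate the stored size of one quantized vector.

    d              -- vector dimensionality
    bits           -- bits per dimension
    overhead_bytes -- ASSUMED fixed per-vector overhead, inferred from the table
    """
    packed_code = (d * bits + 7) // 8  # round up to whole bytes
    return packed_code + overhead_bytes


def float32_bytes_per_vector(d):
    """Size of the uncompressed float32 vector (4 bytes per dimension)."""
    return 4 * d


# Reproduce the d=128 column of the table.
for bits in (1, 2, 3, 4, 8):
    print(bits, rabitq_bytes_per_vector(128, bits))
```

Under this assumption, the default 3-bit setting stores 60 B against 512 B for float32 at d=128, roughly an 8.5× reduction before the rerank column is counted.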
Example
CREATE INDEX docs_idx ON docs USING HNSW (embedding)
WITH (metric='cosine', quantizer='rabitq', bits=3, rerank=10);