04 · DATA MODEL
Vectors & HNSW
XERJ ships HNSW as the only vector index — it's the one that works at the scales we care about. Distance metrics, graph parameters, and quantization are all tunable from config or per-query.
HNSW graph parameters
| KEY | TYPE | DEFAULT | DESCRIPTION |
|---|---|---|---|
| hnsw_m | u32 | 16 | Bi-directional edges per node per layer. Higher = better recall, more RAM. |
| hnsw_ef_construction | u32 | 200 | Beam width at index-build time. Higher = better graph, slower writes. |
| hnsw_ef_search | u32 | 100 | Default beam width at query time. Override per-query via the KNN request. |
| default_metric | enum | "cosine" | One of "cosine", "dot_product", "euclidean". |
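To make the three default_metric options concrete, here is a minimal Python sketch of the textbook definitions. It is illustrative only: the function names are hypothetical, and the source does not say how XERJ computes or normalizes these scores internally.

```python
import math


def dot_product(a, b):
    """Sum of element-wise products; larger means more similar."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Dot product of the two vectors after scaling each to unit length."""
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    return dot_product(a, b) / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance; smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


a = [1.0, 0.0]
b = [2.0, 0.0]  # same direction as a, twice the length
c = [0.0, 3.0]  # orthogonal to a
```

Cosine ignores vector length (a and b score as identical), while dot product and Euclidean distance do not, which is why the right choice depends on whether the embedding model produces normalized vectors.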
Quantization
XERJ supports four quantization modes. scalar8 (SQ8) is the default and gives ~4× memory reduction with almost no recall loss on typical embedding spaces.
- none — full-precision f32. Use when recall matters more than RAM.
- scalar8 — 8-bit per dimension. Default. ~4× smaller. Recall impact typically under 1%.
- scalar4 — 4-bit per dimension. ~8× smaller. Noticeable recall hit on fine-grained spaces.
- binary — 1-bit per dimension with Hamming distance. ~32× smaller. Use only if you know the embedding space tolerates it.
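The size ratios above follow directly from bits per dimension. The sketch below works through that arithmetic and shows one simple way to do 8-bit scalar and 1-bit binary quantization. The per-vector min/max scaling and the sign-bit rule are assumptions chosen for illustration; the source does not describe the scheme XERJ actually uses, and all function names are hypothetical.

```python
def bytes_per_vector(dims, bits_per_dim):
    """Bytes for the vector payload only (no graph links, no metadata)."""
    return (dims * bits_per_dim + 7) // 8


def quantize_scalar8(vec):
    """Map each component to an integer code in 0..255.

    Assumption: scale by this vector's own min and max.
    """
    lo, hi = min(vec), max(vec)
    scale = (hi - lo) / 255 or 1.0  # avoid a zero scale for constant vectors
    codes = [round((x - lo) / scale) for x in vec]
    return codes, lo, scale


def dequantize_scalar8(codes, lo, scale):
    """Approximate reconstruction of the original components."""
    return [lo + c * scale for c in codes]


def quantize_binary(vec):
    """Keep one bit per dimension. Assumption: the sign of the component."""
    return [1 if x > 0 else 0 for x in vec]


def hamming(a_bits, b_bits):
    """Number of positions where the two bit vectors differ."""
    return sum(x != y for x, y in zip(a_bits, b_bits))


dims = 768  # hypothetical embedding width
full = bytes_per_vector(dims, 32)          # f32: 3072 bytes
ratio_sq8 = full / bytes_per_vector(dims, 8)
ratio_sq4 = full / bytes_per_vector(dims, 4)
ratio_bin = full / bytes_per_vector(dims, 1)
```

For a 768-dimensional vector this gives 3072 bytes at full precision against 768, 384, and 96 bytes for scalar8, scalar4, and binary, which matches the ~4×, ~8×, and ~32× figures in the list.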
KNN query
{
  "knn": {
    "field": "embedding",
    "query_vector": [0.12, 0.08, -0.31, ...],
    "k": 20,
    "num_candidates": 200,
    "ef_search": 180
  }
}
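A client would typically assemble this body programmatically. The following is a minimal sketch in Python, assuming only the field names shown in the request above. The helper build_knn_query is hypothetical, as is the k >= 1 check, and the endpoint the body is sent to is not covered here.

```python
import json


def build_knn_query(field, query_vector, k, num_candidates=None, ef_search=None):
    """Build a KNN request body using the keys from the example above.

    Optional keys are left out when not given, so the server-side default
    (for example hnsw_ef_search) would apply instead of a per-query override.
    """
    if k < 1:
        # Assumption: a request for fewer than one neighbor is a caller bug.
        raise ValueError("k must be at least 1")
    knn = {"field": field, "query_vector": list(query_vector), "k": k}
    if num_candidates is not None:
        knn["num_candidates"] = num_candidates
    if ef_search is not None:
        knn["ef_search"] = ef_search
    return {"knn": knn}


# A short three-dimensional vector stands in for a real embedding.
body = build_knn_query(
    "embedding", [0.12, 0.08, -0.31], k=20, num_candidates=200, ef_search=180
)
payload = json.dumps(body)
```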
Hybrid — BM25 + KNN in one planner pass
{
  "hybrid": {
    "fusion": "rrf",
    "queries": [
      { "match": { "message": "kernel panic on reboot" } },
      { "knn": { "field": "embedding", "query_vector": [...], "k": 50 } }
    ]
  }
}
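The "rrf" fusion value presumably refers to reciprocal rank fusion, which merges ranked lists using only each document's position in each list. Below is a minimal Python sketch of the standard formula, score(d) = sum over lists of 1 / (c + rank). The constant c = 60 is the value conventionally used in the literature; the source does not state what XERJ uses, and rrf_fuse and the document IDs are hypothetical.

```python
def rrf_fuse(rankings, rank_constant=60):
    """Fuse several ranked lists of document IDs with reciprocal rank fusion.

    Each list contributes 1 / (rank_constant + rank) per document, with
    ranks starting at 1. Ties are broken by document ID for determinism.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (rank_constant + rank)
    return sorted(scores, key=lambda d: (-scores[d], d))


bm25_hits = ["doc3", "doc1", "doc7"]  # hypothetical lexical ranking
knn_hits = ["doc1", "doc9", "doc3"]   # hypothetical vector ranking
fused = rrf_fuse([bm25_hits, knn_hits])
```

Because only ranks are used, BM25 scores and vector distances never need to be put on a common scale. Documents that appear in both lists (doc1, doc3) rise above those that appear in only one.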
Source · engine/crates/vector/src/hnsw.rs