04 · DATA MODEL

Vectors & HNSW

XERJ ships HNSW as its only vector index; it is the one that works at the scales we care about. Distance metrics, graph parameters, and quantization are all tunable from config or per query.

HNSW graph parameters

KEY                   TYPE  DEFAULT   DESCRIPTION
hnsw_m                u32             Bi-directional edges per layer. Higher = better recall, more RAM.
hnsw_ef_construction  u32   200       Beam width at index-build time. Higher = better graph, slower writes.
hnsw_ef_search        u32   100       Default beam width at query time. Override per-query via the KNN request.
default_metric        enum  "cosine"  "cosine" · "dot_product" · "euclidean".

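For reference, the three values accepted by default_metric correspond to the standard textbook definitions below. This is a minimal pure-Python sketch of those definitions, not XERJ's implementation; the function names are illustrative, and how the engine turns a distance into a ranking score is not covered here.

```python
import math

def dot_product(a, b):
    """Sum of element-wise products; larger means more similar."""
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    """Dot product after normalizing both vectors to unit length."""
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    return dot_product(a, b) / (norm_a * norm_b)

def euclidean_distance(a, b):
    """Straight-line (L2) distance; smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Toy 2-dimensional vectors, chosen only to keep the arithmetic easy to follow.
query = [1.0, 0.0]
doc = [3.0, 4.0]

print(dot_product(query, doc))         # 3.0
print(cosine_similarity(query, doc))   # 0.6
print(euclidean_distance(query, doc))
```

Cosine ignores vector length while dot product does not, so the two only agree on ranking when the stored embeddings are already normalized.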
Quantization

XERJ supports four quantization modes. scalar8 (SQ8) is the default: it cuts vector memory by ~4× relative to f32, with recall loss typically under 1% on typical embedding spaces.

none: full-precision f32 (32 bits per dimension). Use when recall matters more than RAM.
scalar8: 8 bits per dimension. Default. ~4× smaller than f32. Recall impact typically under 1%.
scalar4: 4 bits per dimension. ~8× smaller. Noticeable recall hit on fine-grained spaces.
binary: 1 bit per dimension, compared with Hamming distance. ~32× smaller. Use only if you know the embedding space tolerates it.
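The size ratios above follow directly from bits per dimension. The sketch below works through that arithmetic and shows one common way to do 8-bit scalar quantization (a per-vector min/max range) and sign-based binarization. All function names are hypothetical, and the actual XERJ quantizer may choose ranges and thresholds differently.

```python
BITS_PER_DIM = {"none": 32, "scalar8": 8, "scalar4": 4, "binary": 1}

def bytes_per_vector(dims, mode):
    """Raw vector storage only; graph links and metadata are not counted."""
    return (dims * BITS_PER_DIM[mode] + 7) // 8  # round up to whole bytes

def quantize_scalar8(vector):
    """Map each f32 component onto 0..255 using the vector's own min/max (an assumption)."""
    lo, hi = min(vector), max(vector)
    scale = (hi - lo) / 255 or 1.0  # fall back to 1.0 for constant vectors
    codes = bytes(min(255, round((x - lo) / scale)) for x in vector)
    return codes, lo, scale

def dequantize_scalar8(codes, lo, scale):
    """Approximate reconstruction; each component is off by at most scale / 2."""
    return [lo + c * scale for c in codes]

def binarize(vector):
    """One bit per dimension: 1 for positive components, 0 otherwise (an assumption)."""
    return [1 if x > 0 else 0 for x in vector]

def hamming(a, b):
    """Number of positions where the two bit vectors differ."""
    return sum(x != y for x, y in zip(a, b))

# Storage for a 768-dimensional embedding under each mode.
for mode in BITS_PER_DIM:
    print(mode, bytes_per_vector(768, mode), "bytes")

# Round-trip a toy vector through 8-bit quantization.
original = [0.12, 0.08, -0.31, 0.5]
codes, lo, scale = quantize_scalar8(original)
restored = dequantize_scalar8(codes, lo, scale)
print(max(abs(a - b) for a, b in zip(original, restored)))
```

For 768 dimensions this gives 3072, 768, 384, and 96 bytes, which matches the ~4×, ~8×, and ~32× figures in the list.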

KNN query

{
  "knn": {
    "field":          "embedding",
    "query_vector":   [0.12, 0.08, -0.31, ...],
    "k":              20,
    "num_candidates": 200,
    "ef_search":      180
  }
}
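A request body like the one above can be assembled programmatically. The helper below is a hypothetical client-side sketch, not part of any XERJ SDK; it assumes num_candidates and ef_search may be omitted, since the hybrid example leaves them out and ef_search is described as an override. The three-component vector is a toy stand-in for a real embedding.

```python
import json

def knn_query(field, query_vector, k, num_candidates=None, ef_search=None):
    """Build a KNN request body, leaving out options the caller did not set."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    body = {"field": field, "query_vector": list(query_vector), "k": k}
    if num_candidates is not None:
        body["num_candidates"] = num_candidates
    if ef_search is not None:
        # Per-query override of the hnsw_ef_search default.
        body["ef_search"] = ef_search
    return {"knn": body}

request = knn_query(
    "embedding",
    [0.12, 0.08, -0.31],  # toy vector; real embeddings have many more dimensions
    k=20,
    num_candidates=200,
    ef_search=180,
)
print(json.dumps(request, indent=2))
```

Omitting ef_search in the call leaves the key out of the body, so the configured hnsw_ef_search default applies.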

Hybrid — BM25 + KNN in one planner pass

{
  "hybrid": {
    "fusion": "rrf",
    "queries": [
      { "match": { "message": "kernel panic on reboot" } },
      { "knn":   { "field": "embedding", "query_vector": [...], "k": 50 } }
    ]
  }
}
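The "rrf" fusion value refers to reciprocal rank fusion, which merges ranked lists by summing 1 / (c + rank) for each document across the lists. The sketch below shows the general technique. The constant 60 is the value commonly used in the literature, not a confirmed XERJ default, and the function name and tie-breaking rule are assumptions.

```python
def rrf_fuse(rankings, rank_constant=60):
    """Fuse several ranked lists of document IDs with reciprocal rank fusion."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (rank_constant + rank)
    # Highest fused score first; ties broken by ID so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

# Hypothetical result lists from the two sub-queries.
bm25_hits = ["doc_a", "doc_b", "doc_c"]
knn_hits = ["doc_c", "doc_a", "doc_d"]

print(rrf_fuse([bm25_hits, knn_hits]))
# ['doc_a', 'doc_c', 'doc_b', 'doc_d']
```

Because only ranks enter the formula, BM25 scores and vector distances never need to be normalized onto a common scale.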

Source · engine/crates/vector/src/hnsw.rs
