THE PRODUCT· V0.1· BENCHMARKS 2026-04-14

ONE BINARY.
THREE STORES.

Logs. Vectors. Agent memory. One Rust engine instead of three glued systems. This page shows the product — real dashboard visualizations on seeded data, measured numbers against Elasticsearch 8.13, and the engine internals that make the numbers possible.

QUERIES OVER TIME · 24H · LIVE SEED
00:00 · 432.3 q/s · min 290.27 q/s · peak 1.52K q/s · 537.41 q/s · 24:00
01·BEFORE YOU SCROLL

Skip ahead — see it
running in your browser.

The visualizations below are not illustrations. They're the real dashboard components rendered on seeded mock data so you can see what the product looks like before a single config line. Drop a work email if you want a private binary and the reproduction scripts for the benchmark tables.

OR OPEN THE PLAYGROUND · UNLOCK WITH ONE EMAIL · FULL UX
02·THE UX

TYPE IS
THE UI.
LIVE DATA.

XERJ rejects the Kibana aesthetic — boxed visualizations, shadowed cards, icon-heavy toolbars. Every screen is typography, negative space, and 1 px lines. Every chart below is the same component that runs in the dashboards — same code, same rules, same restraint.

LATENCY · PER MODEL · AXONOMETRIC RIBBON
OPUS 4.6 1.72K · SONNET 4.6 969.91 · HAIKU 4.5 395.65 · GPT-5 1.09K · GEMINI 3 1.31K
axonometric · 5 series · min 223.03 · max 1.73K

Five models, 60 seconds. Depth encodes model generation — OPUS at the front, HAIKU at the back. Every stroke is a 1 px line.

03·VECTOR INDEX

WHERE YOUR
QUERIES LIVE.

Every query is a point in 1536-dimensional space. The dashboard projects them down to 2D and colors the six most common intents. Fresh queries appear as hollow marks; old ones fade. This is the EmbedSpace primitive — you get it by wiring a dense vector field to the field catalog, no additional config.

830,000 QUERIES · 48 HOURS · UMAP 2D
RAG RETRIEVAL · CODE ASSIST · DOC Q&A · EXTRACT JSON · CLASSIFY · AGENT TOOL
UMAP · 830 embeddings · 6 clusters
04·SERVICE GRAPH

SEE EVERY
HOP.

Traces aren't a separate product — they're a graph query on the logs index. The ChordArcs primitive renders service → index flows as a text-first sankey. Traffic is real rps; no icon waterfall. Drill through any edge to the underlying spans.

SERVICE → INDEX · FLOWS · 1H AVERAGE
SOURCE → TARGET · api-gateway · auth · app · search · embed-svc · fts-index · vector-index · agent-memory · logs · metrics
11 flows · 55 · rps
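As a rough illustration of treating traces as a graph query over log records, the sketch below groups span records into service → index edges and converts counts into average requests per second. The field names `service` and `index`, the sample pairs, and the helper `flows_rps` are assumptions made for this example; they are not XERJ's schema or query API.

```python
from collections import Counter

def flows_rps(spans, window_seconds):
    """Aggregate span records into (source, target) edges with average requests/sec."""
    counts = Counter((s["service"], s["index"]) for s in spans)
    return {edge: n / window_seconds for edge, n in counts.items()}

# Made-up span records for a one-hour window; field names are assumptions.
spans = (
    [{"service": "search", "index": "fts-index"}] * 7200
    + [{"service": "search", "index": "vector-index"}] * 3600
    + [{"service": "app", "index": "logs"}] * 1800
)

flows = flows_rps(spans, window_seconds=3600)
for (source, target), rps in sorted(flows.items(), key=lambda kv: -kv[1]):
    print(f"{source} -> {target}  {rps:.1f} rps")
```

Each dictionary key is one edge of the flow diagram, so drilling through an edge amounts to filtering the same records by that (service, index) pair.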
05·QUERY SHAPE

FIND THE
OUTLIER
QUERIES.

Twelve recent queries across six dimensions. Drag a handle on any axis in the live version to filter. The dashboard highlights queries whose recall@10 falls below a threshold — this is how you find the cases your retrieval is failing on.

LATENCY · TOKENS · COST · CACHE · RECALL · DOCS
LATENCY_MS 0–2.5K · TOKENS 0–12K · COST_USD 0–6.0 · CACHE_HIT 0–100 · RECALL_AT_10 0–100 · DOCS 0–50
12 rows · 6 dimensions · 1 highlighted
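The threshold highlight described above can be sketched as a simple filter over query rows. The `QueryRow` type, its field names, and the sample values are hypothetical and chosen only to mirror the axis labels; the real dashboard may compute this differently.

```python
from dataclasses import dataclass

@dataclass
class QueryRow:
    query_id: str
    latency_ms: float
    recall_at_10: float  # percent, 0-100, matching the RECALL_AT_10 axis

def low_recall(rows, threshold):
    """Return the rows whose recall@10 falls strictly below the threshold."""
    return [r for r in rows if r.recall_at_10 < threshold]

# Made-up rows for illustration only.
rows = [
    QueryRow("q1", 120.0, 92.0),
    QueryRow("q2", 840.0, 41.0),
    QueryRow("q3", 310.0, 88.0),
]

outliers = low_recall(rows, threshold=60.0)
print([r.query_id for r in outliers])
```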
06·ATTENTION

WHAT THE MODEL
READ.

When a RAG pipeline misses, the first question is "did the model even look at the right words?" XERJ's attention overlay marks the retrieved passage with per-token weight. Accent-yellow tokens are the top 20 % by attention weight. No heat tiles, no color ramp — just opacity.

RAG ANSWER · TOP-3 TOKENS BY ATTENTION WEIGHT
Retrieval augments generation by pulling the most relevant passages from a vector index at query time so the model answers from facts rather than from memorised patterns.
peak attention on Retrieval · by · passages
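One way to read the overlay rule, as a sketch: rank tokens by attention weight, mark the top 20 % as accented, and scale every weight to an opacity between 0 and 1. The weights below are made up, and the helper names `top_fraction` and `opacities` are assumptions for this example rather than XERJ functions.

```python
import math

def top_fraction(weights, fraction=0.2):
    """Return the indices of the tokens whose weight is in the top `fraction`."""
    k = max(1, math.ceil(len(weights) * fraction))
    ranked = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    return set(ranked[:k])

def opacities(weights):
    """Scale each weight to a 0..1 opacity relative to the largest weight."""
    peak = max(weights)
    return [w / peak if peak > 0 else 0.0 for w in weights]

tokens = ["Retrieval", "augments", "generation", "by", "pulling"]
weights = [0.40, 0.10, 0.15, 0.30, 0.05]  # made-up attention weights

accent = top_fraction(weights)
alpha = opacities(weights)
for i, tok in enumerate(tokens):
    mark = "*" if i in accent else " "
    print(f"{mark} {tok:<12} opacity={alpha[i]:.2f}")
```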
07·TOKEN BUDGET

WHERE THE
TOKENS GO.

Context windows are big and boring. XERJ visualizes the budget as a left-to-right flow: system prompt, context, question, completion. Width encodes tokens. You spot a bloated context at a glance.

TOKEN BUDGET · 24 HOURS · T = 1,000 TOKENS
SYS PROMPT
84K T · 5.5%
CONTEXT
1.24M T · 81.2%
QUESTION
24K T · 1.6%
COMPLETION
180K T · 11.8%
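The percentages above follow from dividing each segment by the total. The sketch below reproduces them, reading 84K as 84,000 and 1.24M as 1,240,000; that reading is an assumption, but the shares are unaffected by the unit since every segment scales the same way.

```python
# Segment sizes as displayed in the token budget figure.
budget = {
    "SYS PROMPT": 84_000,
    "CONTEXT": 1_240_000,
    "QUESTION": 24_000,
    "COMPLETION": 180_000,
}

total = sum(budget.values())
# Share of the total per segment, as a percentage rounded to one decimal.
shares = {name: round(100 * tokens / total, 1) for name, tokens in budget.items()}

for name, pct in shares.items():
    print(f"{name:<11} {pct}%")
```

Because each share is rounded independently, the displayed percentages sum to 100.1 rather than exactly 100.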
08·CLASSICS

THE BREAD
AND BUTTER.

Every dashboard still needs a top-N, a distribution, and a heatmap. XERJ ships them as the same 1 px primitives the rest of the visuals are built from — same code, same rules, different shapes.

TOP DOCUMENTS RETRIEVED · 24H
runbook/oncall.md · 12.43K · 21.5%
rfc/042-retention.md · 9.82K · 17.0%
arch/cluster-design.md · 8.2K · 14.2%
rfc/039-hybrid-search.md · 7.1K · 12.3%
runbook/incident-1411.md · 6.4K · 11.1%
policy/pii.md · 5.2K · 9.0%
arch/hnsw-internals.md · 4.8K · 8.3%
docs/query-dsl.md · 3.9K · 6.7%
BY MODEL · SHARE OF TRAFFIC
SONNET 4.6 · 42.0%
HAIKU 4.5 · 28.0%
OPUS 4.6 · 18.0%
GPT-5 · 8.0%
GEMINI 3 · 3.0%
OTHER · 1.0%
SPEND · WEEKDAY × 2H
HOUR 00 02 04 06 08 10 12 14 16 18 20 22
MON $2 $2 $13 $43 $89 $135 $169 $167 $141 $95 $54 $16
TUE $2 $2 $9 $56 $95 $148 $165 $170 $141 $108 $44 $2
WED $2 $2 $2 $54 $91 $148 $166 $160 $135 $102 $59 $11
THU $2 $2 $9 $47 $104 $139 $167 $165 $131 $104 $59 $14
FRI $2 $2 $4 $48 $101 $132 $162 $153 $148 $99 $56 $7
SAT $2 $2 $6 $19 $38 $58 $85 $81 $65 $46 $21 $4
SUN $2 $2 $6 $18 $40 $61 $74 $71 $64 $52 $18 $3
OPEN THE LIVE PLAYGROUND
09·MEASURED VS ELASTICSEARCH 8.13

NUMBERS YOU
CAN REPRODUCE.

Every row below comes from a dated, checked-in battle report in the engine repository. No synthetic micro-benchmarks, no hand-picked best run — the reports include repro scripts and hardware profiles. Where ES shards its cluster across 4 nodes, we run XERJ on 1. The caveats are published alongside the wins.

TEST · 1M EVENTS · P95 LATENCY
SIEM top-source-IPs · terms aggregation
ELASTICSEARCH 29.8 ms · XERJ 0.4 ms · DELTA 74× · SIEM_BATTLE_2026-04-14_184900_UTC.md §2
SIEM 16-query analyst battery · median
ELASTICSEARCH 4.1 ms · XERJ 0.6 ms · DELTA 6.8× · SIEM_BATTLE_2026-04-14_184900_UTC.md §2
NGINX terms aggregation · 5 distinct values
ELASTICSEARCH 35.7 ms · XERJ 0.4 ms · DELTA 89× · CLUSTER_BATTLE_2026-04-14_180241_UTC.md §3
NGINX keyword term query · method=GET · warm cache
ELASTICSEARCH 2.0 ms · XERJ 0.3 ms · DELTA 6.7× · HEAD_TO_HEAD_M3_2026-04-14.md
Cold start · JVM warm-up vs static binary
ELASTICSEARCH ~15 s · XERJ 50 ms · DELTA 300× · SIEM_BATTLE_2026-04-14_184900_UTC.md §4
Release binary size · deploy footprint
ELASTICSEARCH 620 MB · XERJ 11 MB · DELTA 56× · SIEM_BATTLE_2026-04-14_184900_UTC.md §4
Idle memory · 1-node XERJ vs 4-node ES heap
ELASTICSEARCH 8.5 GB · XERJ 400 MB · DELTA 21× · SIEM_BATTLE_2026-04-14_184900_UTC.md §4
Disk footprint · 1M SIEM events · post-ingest
ELASTICSEARCH 240 MB · XERJ 85 MB · DELTA 2.8× · SIEM_BATTLE_2026-04-14_184900_UTC.md §4
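The DELTA figures appear to be the Elasticsearch value divided by the XERJ value once both are in the same unit. The sketch below checks three rows under that assumption, converting ~15 s to 15,000 ms and 8.5 GB to 8,500 MB (decimal units, also an assumption).

```python
# (test name, Elasticsearch value, XERJ value), both in the same unit per row.
rows = [
    ("SIEM top-source-IPs", 29.8, 0.4),   # ms
    ("Cold start", 15_000, 50),           # ms, ~15 s taken as 15,000 ms
    ("Idle memory", 8_500, 400),          # MB, 8.5 GB taken as 8,500 MB
]

# Ratio of the two measurements; larger means XERJ used less time or space.
ratios = {name: es / xerj for name, es, xerj in rows}

for name, ratio in ratios.items():
    print(f"{name:<20} {ratio:.2f}x")
```

The published deltas (74×, 300×, 21×) match these ratios with the fraction dropped.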
10·ENGINE

WHAT MAKES
IT FAST.

Speed is not one trick. XERJ is fast because everything happens in one process, in one language, against data laid out exactly the way the queries want to read it. Below are the six choices that account for most of the delta.

No JVM
Rust compiles to a static binary with no garbage-collected heap, no GC pauses, no JIT warm-up, and no page-fault storms after a restart. Cold start is 50 ms, not 15 s.
Columnar native
Every column is encoded at write time — delta-of-delta for timestamps, dictionary + ZSTD for keywords, FOR+RLE for small integers, SQ8 for vectors.
Pre-computed term histograms
The 74× SIEM win comes from keeping a per-column value histogram at ingest time. Terms aggregations read it directly instead of scanning posting lists.
Hybrid planner
BM25 and vector search share one query tree, one cost model, and one execution pass. No RRF orchestrator, no two-system round trip.
Single-process ingest
The NGINX/JSON/syslog/OTLP parsers run in the same process as the writer. One fsync, one network hop, end-to-end back-pressure.
Explain plan is a feature
/v1/explain-plan returns the exact optimizer tree with per-node cost and row counts.
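Two of the choices above, delta-of-delta timestamp encoding and the per-column value histogram kept at ingest, can be sketched in a few lines. This is a toy illustration of the general techniques, not XERJ's on-disk format or internals; the function and class names are assumptions made for the example.

```python
from collections import Counter

def delta_of_delta_encode(timestamps):
    """Keep the first value, then the change between consecutive deltas."""
    if not timestamps:
        return []
    out = [timestamps[0]]
    prev, prev_delta = timestamps[0], 0
    for t in timestamps[1:]:
        delta = t - prev
        out.append(delta - prev_delta)  # near zero for regularly spaced events
        prev, prev_delta = t, delta
    return out

def delta_of_delta_decode(encoded):
    """Invert delta_of_delta_encode by accumulating deltas, then values."""
    if not encoded:
        return []
    out = [encoded[0]]
    delta = 0
    for dod in encoded[1:]:
        delta += dod
        out.append(out[-1] + delta)
    return out

class TermHistogram:
    """Per-column value counts updated at ingest time."""

    def __init__(self):
        self.counts = Counter()

    def ingest(self, value):
        self.counts[value] += 1

    def top_terms(self, n):
        # Answered from the counts alone, with no scan over stored rows.
        return self.counts.most_common(n)

timestamps = [1000, 1010, 1020, 1030, 1045]
encoded = delta_of_delta_encode(timestamps)

hist = TermHistogram()
for ip in ["10.0.0.1", "10.0.0.2", "10.0.0.1", "10.0.0.3", "10.0.0.1", "10.0.0.2"]:
    hist.ingest(ip)

print(encoded)
print(hist.top_terms(2))
```

Regularly spaced timestamps encode to long runs of zeros, which is what makes the scheme compress well, and the histogram turns a top-N terms query into a lookup whose cost depends on the number of distinct values rather than the number of events.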
GO BACK TO DEMO REQUEST