Roadmap
Status of Edge in three columns: what’s built, what’s next, what’s been decided against (and why).
Updates land here as decisions are made. The README links to this file as the source of truth for “what’s the current state.”
Built and exercised
Section titled “Built and exercised”These are validated end-to-end against a 10 M Lichess corpus.
Indexer (indexer)
Section titled “Indexer (indexer)”- PGN →
.scoredbin a single MapReduce pass. - Tiered shard count by PGN size (1 / 4 / 16 / 32 / 64).
- Timestamped output directory naming,
--out PATHoverride. - 36-byte
ScoreHeaderwith full PGN metadata (result, termination, ECO, ratings, date, TC, player name indices). - 16-bit move encoding (SMove16: 4-flag, 4-promotion).
- Per-shard player-name dictionaries.
- Per-game
.gameidxsidecar (uint64 byte offsets) for O(matches) bitmap-filtered scans. - Aggregate edge table persisted as
tree.dat(see Tree below). - Chess kernel in
position.hpp:piece_at[64]mailbox + templatedmake_move<Opts>+ state-machine SAN parser + NEON-2-way batched movegen + FIDE-2024+ EP semantics (only hash EP when capture is legal — makes transpositions aggregate correctly).
.scoredb directory format (v2)
Section titled “.scoredb directory format (v2)”metakey=value file (version, num_shards, total_games, etc.).shards/N.moves+shards/N.dict+shards/N.gameidxfiles.tree.datat the scoredb root (header + bucket-offset table + packed Edge[] payload).bitmaps/subdirectory for stored bitmaps.- See DATA-FORMAT.md. (File extension renamed
from
.score→.movesin 2026-05; the format itself is unchanged from v2.)
Query engine (query-engine) — the CSW model
Section titled “Query engine (query-engine) — the CSW model”A query = a boolean combination of conditions over a game’s ply
trace, under a global header-predicate AND prefilter. A condition is
(C, S, W): composite position-predicate, side-quantifier, window-
quantifier. (Grammar: QUERY-LANGUAGE.md; logic:
FORMAL-BASIS.md; outputs: OUTPUT-MODEL.md.)
- Header predicates (~20): result, ECO range, year, ELO bands, TC
category, termination, player names (per-shard dict resolution),
--gm/--master. AND-chained; the cheap first tier (~10 ms / 10M). - Position predicates (C):
--queens-off,--bishop-pair(the comparative pair),--doubled-pawn,--isolated-pawn,--passed-pawn,--rook-on-seventh(absolute 7th),--rook-on-open-file,--king-castled, and--count EXPR(material mini-language:"QBNqbn=0","R=r","P<p", …). - Side quantifiers (S):
--side white|black|either|both|neither. - Window quantifiers (W):
--streak N/--min-streak,--ever,--always(≥1-ply gated),--never,--at-ply,--from-ply,--until-ply,--between-ply. - Edge transforms:
--became/--ceased(1-step-past). - Game-level boolean (DNF):
--cond/--and(AND in a group),--or(between groups),--not(per condition). - Per-predicate floors: doubled/bishop default to
streak:2. - Scan paths: header-only (~10 ms) and BB_ONLY (~0.3 s / 10M;
6.48 ns/ply hot path) via
MaintenanceNeeds+ScanPath. (full-hash path designed, not wired.) - Output: game-bitmap
--output @name|PATH(PMOTE-BM);--input-bitmapprefilter (cascade-AND).
Tree (explorer + tree.dat)
Section titled “Tree (explorer + tree.dat)”- Aggregate edge table persisted by indexer at finalize.
explorer <corpus.scoredb> --fen "FEN"performs FEN → position-hash → bucket scan → moves with W/D/B counts. UCI output today (SAN deferred).- “Explorer” is the chess-player term for a position-tree query tool.
- 10 M Lichess: tree.dat ~2.4 GB, query ~0.15 ms warm.
- FIDE-2024+ EP semantics → transpositions aggregate correctly
(e.g.,
1.Nf3 d5 2.d4≡1.d4 d5 2.Nf3).
Bitmap algebra
Section titled “Bitmap algebra”bitmap-combine:info,count,and,or,xor,sub,not.- PMOTE-BM file format with header, shard table, dense bits.
- Byte-identical validation across set operations.
Aggregation reducers (the OUTPUT half — heatmap + group-by)
Section titled “Aggregation reducers (the OUTPUT half — heatmap + group-by)”The model is result = reduce(boolean-combo-of-(C,S,W)). Two of three
reducers are built: the game-bitmap reducer (above) and aggregation,
on a dedicated per-position scan path (scan_bb_aggregate, separate from
the game-bitmap scan_bb_only). The aggregation path reduces over every
matching position (the DNF-combined C+S predicate; no game-level window
quantifier, no early-exit); game-level filtering stays the header
prefilter + --input-bitmap.
- square-heatmap (
--heatmap): per matching position, increment aheat[color][type][square]table (per-shardu64, overflow-safe). Output PMOTE-HM (merged2×6×64u64 LE). - group-by (
--group-by pawn-structure,--top-n K): bucket matching plies by a derived feature key via an open-addressing hash-map (key → {count, raw}), finalize to top-N. First feature kernel: pawn-structure signature (mix of the white/black pawn bitboards). Output PMOTE-GB. (histogram = same machinery over a small enum key; not yet exposed.) - Validated on the 10.16M corpus: with no predicate,
heat[white-king]summed == group-by grand total == 685,863,447 (the true ply count); under--queens-off, both == 187,591,286. Reducers sample POST-move positions (top group = after1.e4 e5). Game-bitmap path unchanged (killer 152,148). Perf: heatmap ~1.25 s, group-by ~1.66 s. - position-stream (
--positions ref|fen|both,--positions-unique,--limit N): emit a record per matching position — a(shard,game,ply)reference, the board as a FEN (serialized from the board in hand), or both;--outputwrites PMOTE-PS (binary refs) and/or.fentext. The unbounded sibling of the GameSet reducer. Validated on 10M:--queens-offstream == 187,591,286,unique== 182,550,541 distinct,--limitexact. (All three OUTPUT-MODEL reducer families are now built.)
Validation (M4 Max, 10.16 M-game Lichess corpus)
Section titled “Validation (M4 Max, 10.16 M-game Lichess corpus)”- Corpus: 10,164,006 games / 685,863,447 plies (~67.5 plies/game).
- Indexing wall: ~1.79 s (drops 0-move games → 10,164,006 games).
meta total_movesnow reports the true ply total (685,863,447) — previously a capped tree-tuple count (199,672,940); the scan/aggregate population matches it exactly.engine_reset’s start-position init is now a function-local static (C++11 thread-safe), fixing a prior non-atomic lazy-init data race (torn reads under the per-shard worker threads).- Killer rook-endgame query (two
--countconditions unioned): 152,148 matches, ~0.3 s per branch (BB_ONLY). - Header-only query (
--result W= 5,056,391): ~10 ms. - Tree query (single position via
explorer): ~0.15 ms warm. - Engine hot path: 6.48 ns/ply single-thread BB_ONLY.
- All 6 standard perft positions pass.
- CSW operators pinned by exact cross-checks:
never== complement ofever;either+neither== corpus; inclusion-exclusion on white/black/either/both; in-engineOR== externalbitmap-combine or(byte-identical);Ever(Became(P))==Ever(P).
Engine optimization (2026-05-29 campaign)
Section titled “Engine optimization (2026-05-29 campaign)”An 8-commit hardening of query-engine.cpp, every change benchmark- or
behavior-gated:
- Hot path: the BB-scan gating / per-branch / full-Boolean machinery is
monomorphized on a compile-time query shape (
<HasGates, HasPbh>), recovering a +21% regression to neutral (queens-off ~0.06 s / 1M). - One scan driver: the game-bitmap and aggregate paths are unified under a
single
bb_replaydriver withBitmapBody/AggregateBodypolicies (always_inline-flat — a policy over the hot loop costs 27% without it; the benchmark caught it, speculation would have shipped it). - One header parser (the flat CLI routes through
build_header_pred), one output-path resolver, dead code removed — net −63 LOC vs pre-campaign. - Latent bug fixed: the PMOTE-HM heatmap truncated counts to u32 (silent overflow past ~60-100M games); widened to u64 (format v2).
- Group-by hash measured healthy at scale (avg 1.82 probes over 56.7M keys).
Next — short term
Section titled “Next — short term”In rough priority order. None are blocked; pick up the one that matches the next use case.
libedge extraction
Section titled “libedge extraction”Pull the C primitives out of the four CLI binaries into a shared
library (libedge.a + headers). CLI tools become thin wrappers.
Enables the daemon and WASM builds without parallel rewrites.
Estimated work: ~300 lines of refactoring + new headers, mostly mechanical.
Daemon
Section titled “Daemon”A minimal HTTP loopback service exposing the core operations as JSON-RPC or REST endpoints. Sketch in ARCHITECTURE.md.
Estimated work: ~500 lines for the HTTP shell + JSON encoding. Library-side work is already done; this is purely the integration shell.
Threads decoupled from shards
Section titled “Threads decoupled from shards”The indexer currently couples worker threads to shards (each thread writes one shard). For 4-core machines indexing 16-shard corpora, this creates light oversubscription. Switch to a work-queue model: fixed thread pool, chunks pulled by atomic counter, each chunk produces one shard.
Estimated work: ~80 lines for the work queue + chunk-to-buffer decoupling.
Fix indexer game-count drift — RESOLVED (2026-05-28)
Section titled “Fix indexer game-count drift — RESOLVED (2026-05-28)”The 30,933-game discrepancy was the zero-move (abandoned-before-move-1)
games. The indexer now drops move_count==0 games at index, and the
.gameidx / .moves / total_games counters are in lockstep (all
10,164,006). This also fixed a latent .gameidx inconsistency (0-move
games previously got gameidx entries but weren’t counted).
Aggregation reducers (the OUTPUT half) — BUILT (2026-05-29)
Section titled “Aggregation reducers (the OUTPUT half) — BUILT (2026-05-29)”The model is result = reduce(boolean-combo-of-(C,S,W)). square-heatmap
(--heatmap) and group-by (--group-by pawn-structure) now run on a
dedicated scan_bb_aggregate path, emitting PMOTE-HM / PMOTE-GB — see
Built above. The flagship multi-scan analytical feature (extract pawn
structures → frequency → where masters place pieces) is now expressible.
The third reducer, position-stream (--positions ref|fen|both), also
landed (2026-05-29) — all three OUTPUT-MODEL reducer families are now built.
Text-DSL parser (the human query surface) — BUILT (2026-05-29)
Section titled “Text-DSL parser (the human query surface) — BUILT (2026-05-29)”--query "<expr>" is a recursive-descent parser for the locked grammar
(QUERY-LANGUAGE.md) — nested precedence, the full
(C, S, W) condition form, then, value-set brackets — lowering directly to
the same engine structs the flat CLI builds (Option 1, no JSON layer), sharing
the predicate/header builders so the two surfaces can’t drift. Round-trip
validated 9/9 against the flat flags. The canonical JSON AST interchange
(Option 2) is deferred until a non-CLI consumer lands.
then (ordered gating) + per-branch headers + full-Boolean C — BUILT (2026-05-29)
Section titled “then (ordered gating) + per-branch headers + full-Boolean C — BUILT (2026-05-29)”then (ordered gating) + per-branch headers + full-Boolean Cthen(P₁,…,Pₙ) adds ordering via single-pass predicate gating (a latched
per-game fired-flag per condition; a gated condition counts only at/after its
gate fires) — no automaton, no nesting. while/until were dropped:
continuity tied to an event isn’t a chess need (streak covers run-length, AND
covers co-occurrence). Per-branch headers (a header inside an OR-branch) make the
killer union a single (C AND result:W) OR (C AND result:B) query — 152,148
bit-identical, one scan. And a condition’s composite C is now a full Boolean
(AND/OR/NOT) over its predicate leaves, not just an AND.
WASM build
Section titled “WASM build”Compile libedge to WASM, expose a JS-friendly API. Acknowledge the
4 GB memory ceiling and async filesystem constraints in docs. Smaller-
corpora tier; not the canonical delivery target.
Estimated work: ~200 lines of binding code + an Emscripten/wasi build setup. Filesystem access via OPFS or virtual mounting.
Next — medium term
Section titled “Next — medium term”Features that we’ve discussed but haven’t designed in detail.
Match-record streaming
Section titled “Match-record streaming”For app-level aggregations beyond the canonical set, query-engine
emits a stream of (shard_id, game_id_in_shard, ply, board_hash) per
matching position. Apps consume the stream and aggregate however they
want.
Open questions: streaming format (binary vs JSON-lines)? Include board
hash or board snapshot? How does this interact with min-streak
(emit on each matching ply, or only when the streak fires)?
Composable position sequences
Section titled “Composable position sequences”Multi-position patterns within a single game. E.g., “find games where white has the bishop pair at one point AND later reaches a queenless endgame.” Currently expressed via bitmap intersection, but within-game sequences aren’t equivalent to game-set intersections (the ordering and same-game constraint matter).
Open design space; no concrete spec yet.
Position-derived opening classification
Section titled “Position-derived opening classification”Instead of relying on the PGN [ECO] tag, classify each game’s
opening by the actual moves played. Useful when corpora have missing
or inconsistent ECO tags. Requires an opening-table data structure
matched at index time.
Estimated work: ~500 lines once we have a position-table representation chosen.
Sparse bitmap encoding
Section titled “Sparse bitmap encoding”For corpora with many stored very-sparse bitmaps (e.g., 500 named ECO buckets), dense bitmaps waste disk. Add Roaring or sorted-ID encoding alongside dense, choose per-bitmap at write time based on density. Reader auto-detects via a header flag.
Deferred until disk pressure becomes real; the trigger would be ≥30 stored bitmaps with most under 5% density.
Negative position predicates
Section titled “Negative position predicates”Sub-FEN currently expresses inclusion (“this square has this piece”); no clean way to express exclusion (“no white pawn on this file”). Workarounds via piece-count rules cover some cases but not all.
Design path: extend sub-FEN with a ! prefix for negation, or add a
companion --no-piece-on FILE PIECE flag. Both have ergonomic
tradeoffs; defer until a real use case forces the decision.
SAN output in explorer
Section titled “SAN output in explorer”Currently emits UCI moves (e2e4, e7e8q). UCI is unambiguous and trivial to decode from SMove16; SAN needs full legal-move generation + disambiguation logic (~80 lines). Defer until a real consumer asks for SAN output specifically.
Per-bucket sort in tree.dat
Section titled “Per-bucket sort in tree.dat”Currently the linear scan over ~73 K edges per bucket is ~0.15 ms
warm. Sorting buckets by prev_hash at write time would drop this
to microseconds via binary search. Not worth doing until a workload
demands it; 0.15 ms is already faster than any UI cares about.
score-extract utility
Section titled “score-extract utility”Walk a bitmap and dump matching games’ PGN text from the source PGN to stdout (or to a new PGN file). Closes the loop on “find these games → see them.”
Estimated work: ~150 lines. Uses pgn_offset from each record’s header.
Folder-mode ingest
Section titled “Folder-mode ingest”Index a directory of multiple PGN files into one .scoredb. Either
concat the PGNs into one logical stream during indexing, or process N
inputs into N corpora. Decision deferred until a real use case clarifies
the desired shape.
Decided against (with reasons)
Section titled “Decided against (with reasons)”--aggregate branch-tree flag in query-engine
Section titled “--aggregate branch-tree flag in query-engine”Why not: The tree builder already does this aggregation during indexing. Filtered branch trees are the same machinery run over a filtered subset of games (via input bitmap). Adding a flag to the query engine for this would be redundant. Build the filtered-tree runner instead.
Era aliases (--modern-era, --engine-era, etc.)
Section titled “Era aliases (--modern-era, --engine-era, etc.)”Why not: Year ranges are clear enough as primitives. --year 1985-2010
is no harder than --modern-era, and era boundaries are arbitrary.
Reviewed at the alias-design discussion; user-vetoed.
Opening-name aliases (--sicilian, --french, etc.)
Section titled “Opening-name aliases (--sicilian, --french, etc.)”Why not: The full set would balloon (every ECO chapter has its own opening name) and the mappings are fuzzy (Sicilian vs Sicilian sub-lines vs transposition cases). Year/ECO ranges suffice for now. Will revisit if a clean named-opening registry emerges.
Material macros (--queens-off, …) — REVERSED, now built
Section titled “Material macros (--queens-off, …) — REVERSED, now built”--queens-off, …)Originally declined in favor of --count. Reversed: named position
predicates (--queens-off, --bishop-pair, --doubled-pawn, …) are now
the readable, discoverable surface, with --count EXPR as the power tool
underneath. Both coexist — named predicates encode chess concepts (e.g.
bishop_pair is the comparative advantage, not just “two bishops”) that
--count can’t express.
Built-in --rp-endgame-* rules
Section titled “Built-in --rp-endgame-* rules”Removed. The original implementation baked in five rook-pawn-endgame
predicates (entry-strict + permissive variants). Replaced by
composition of --count primitives. Validated to match exactly on the
permissive variants; entry-strict semantics were experimental and
unused.
Multi-bitmap archive format (.bix)
Section titled “Multi-bitmap archive format (.bix)”Deferred. Initially considered for storing many precomputed bitmaps
in a single file. The directory-as-database approach (the .scoredb
form) achieves the same outcome — bitmaps live in bitmaps/ as
individual files. Archive format adds a directory layer we don’t need.
Roaring bitmaps as default encoding
Section titled “Roaring bitmaps as default encoding”Deferred. For our hot path (prefilter scan with bit-per-game test), dense bitmaps are faster (cache-friendly, branch-predictable). Roaring would only earn its keep for very-sparse stored bitmaps with disk pressure, which we don’t have yet. Format reserves an upgrade path.
Putting the C engine inside Tabia (as WASM addon)
Section titled “Putting the C engine inside Tabia (as WASM addon)”Decided against. The runtime mismatch (single-game JS vs batch C), the 4 GB WASM ceiling, the threading complexity, and the data-shape mismatch (move tree with annotations vs linear SAN stream) all argue against unification. Edge is a sibling stack project, not a Tabia module. Apps compose both at the app boundary.
Persisting ECO bitmaps at index time
Section titled “Persisting ECO bitmaps at index time”Decided against. ECO is one uint16_t per game in the header; a scan filter takes ~50 ms over 10 M games. Pre-materializing 500 ECO bitmaps would consume disk for negligible speedup. Header-based filters stay header-based.
Generic OR composition inside query-engine — REVERSED, now built
Section titled “Generic OR composition inside query-engine — REVERSED, now built”query-engineOriginally decided against (OR at the bitmap layer via bitmap-combine).
Reversed once the CSW condition model landed: game-level OR/NOT
over conditions (DNF: --or / --not) is now in the engine, byte-identical
to the external bitmap-combine or, and avoids the 2-scan-then-combine
round-trip for the common disjunctive query. bitmap-combine is still the
path for composing persisted bitmaps; per-branch-header unions (headers
inside OR-branches) are now in-engine too — a single DNF query.
Open architectural questions
Section titled “Open architectural questions”These need a decision but don’t block current work.
Daemon API protocol
Section titled “Daemon API protocol”REST vs JSON-RPC vs WebSocket-streaming? Probably hybrid: REST for synchronous operations, WebSocket for long-running ones (indexing progress, large scan results).
Tree edge encoding for sparse positions
Section titled “Tree edge encoding for sparse positions”Most positions in the tree have very few outgoing edges (1–10 for the median game). Tight encoding matters for cache density at lookup time. Choose between sorted-array-per-position vs hash-table-per-position vs trie. Defer until the persistence work.
libedge ABI stability
Section titled “libedge ABI stability”Once we have downstream consumers (daemon, WASM, Node bindings, …), ABI breaks become costly. Versioning policy + header organization matters. Decide before the extraction lands.
Cross-machine .scoredb portability
Section titled “Cross-machine .scoredb portability”Endianness is the obvious one (all data is little-endian per spec).
Less obvious: shard count is a property of the corpus, not the
machine — so a .scoredb built on one machine should run on another
identically. Validate this once decoupled threading lands.
Versioning
Section titled “Versioning”Edge versions independently from PROMOTE’s JS projects. The internal
data formats (scoredb v1, PMOTE-BM v1) are stable; bumping a
format version requires re-indexing existing corpora.
API breaks in libedge will follow semver once the extraction lands
and we ship a 1.0.