Execution Model
Status: architecture (2026-05-29), normative. The conceptual spine the query language, predicate library, and output model all hang off — how a scan computes. PLANNER.md is how this shape is lowered and optimized; ARCHITECTURE.md is the physical engine. The engine already realizes this model implicitly; this doc names it so every other piece slots into one place.
A scan is a single-pass stateful fold over the ply trace
Section titled “A scan is a single-pass stateful fold over the ply trace”A scan walks each game’s plies once, left to right, carrying a bounded
state and emitting outputs. Functionally it is a mapAccum (a Mealy
machine): at each ply,
(state', emissions) = step(state, board, headers)The expensive part — the replay (make_move + board maintenance) — is
paid once per game, no matter how many queries or outputs ride the scan.
Everything else is a transform over the position the replay produced. (This
is why N queries / N reducers / N bitmaps on one scan cost ~the same as one —
see “cost”, below.)
Inputs at a ply
Section titled “Inputs at a ply”- the board after this ply;
- the game headers — constant per game (result, ECO, ratings, …); they live in the 36-byte record and are available throughout the scan;
- the carried state: window counters, latched flags (conditions / gates
already satisfied), maintained tallies (e.g. a running material count),
1-step prior-bits (for
became/ceased), side-to-move, and the reducer accumulators.
Outputs at a ply
Section titled “Outputs at a ply”- the updated carried state, fed to the next ply;
- any emissions triggered this ply (a reducer firing).
The transform is five stages
Section titled “The transform is five stages”step is not one operation; it is a short forward dataflow. Naming the
stages is the point — each is a distinct kind of thing, and almost every
piece of the language and output model lives at exactly one.
- Measure → features.
board → value. A measurement: material signature, pawn-structure hash, endgame-type, king-safety, piece placements, #passed-pawns. Pure; it yields a value, not a bool. Most features recompute per ply; some are maintained incrementally (a running material tally updated by a capture/promotion delta, like the Zobrist hash). - Test → predicates.
feature ⊕ comparison → bool. A predicate is a feature thresholded:queens-off= “queen-count == 0”;is-rook-endgame= “endgame-type == R+P”. This is theCof a condition. - Track → window state.
bool → counter / flag, with gating. Advance or reset a streak counter; latch a “fired” flag when a window is satisfied; suppress a counter until its gate fires. This is the per-condition temporal fold — the carried state’s update — and its semantics are grounded in FORMAL-BASIS.md (LTLf over the finite trace). The side quantifierSrides here too (testCper side, combine per either/both/neither). - Combine → verdict. Flags → the query’s boolean result: the DNF of conditions, headers global or per-branch. “Does this game/position match?”
- Reduce → outputs. verdict and/or features → update an accumulator /
emit. Set a game bit (GameSet); bump a heatmap counter by a feature (a
piece square); route by a group-by key feature; append a
(game, ply)or a FEN. Reducers consume the verdict, features, or both — not just a match-bool.
graph LR BOARD["board<br/>after each ply"] --> M["1 · Measure"] M -->|features| T["2 · Test"] T -->|predicates| TR["3 · Track"] TR -->|flags| C["4 · Combine"] C -->|verdict| R["5 · Reduce"] M -. "feature substrate" .-> R R --> OUT["outputs<br/>game bitmap · heatmap · position stream"] classDef stage fill:#2563eb,color:#fff,stroke:#93c5fd,stroke-width:2px; class M,T,TR,C,R stage;The feature substrate feeds both the filter (via predicates) and the
output (via reducer keys/values). That is why a measurement like
endgame-type is neither a predicate nor a reducer — it is a stage-1
feature that a predicate thresholds (to filter to rook endgames) and a
group-by reducer keys on (to classify into all endgame types).
The bounded-state invariant
Section titled “The bounded-state invariant”What keeps this a single forward pass — rather than an arbitrary stream transducer that would wreck performance and optimizability — is a deliberate discipline:
- Bounded state. The carried state is
O(query size + output size)per game — counters, flags, tallies, accumulators — neverO(plies). No stage keeps a history of past boards; the one concession is the 1-step prior-bit forbecame/ceased. - Acyclic dataflow. Stages flow forward (measure → … → reduce); gates form a DAG (a condition may depend on an earlier one, never a cycle).
O(dag)work per ply — proportional to the query+output size, not the trace length.- Early-exit once the verdicts settle (and all attached reducers permit it).
This is the streaming / co-safety slice of the formalism — the flat fragment of FORMAL-BASIS.md, generalized to the whole dataflow. Nesting temporal operators, unbounded history, or cyclic dependence are out by design: they would force a product automaton and break the single pass.
How the rest of the spec hangs off this
Section titled “How the rest of the spec hangs off this”| Stage | Concept | Spec home |
|---|---|---|
| 1 Measure | feature (board → value) | PREDICATE-LIBRARY (the kernels) |
| 2 Test | predicate C (feature + comparison) | PREDICATE-LANGUAGE / PREDICATE-LIBRARY |
| 3 Track | side S + window W → a condition (C,S,W); gating | QUERY-LANGUAGE, FORMAL-BASIS |
| 4 Combine | a query (boolean combination of conditions + headers) | QUERY-LANGUAGE |
| 5 Reduce | a reducer (GameSet / position output / aggregate) | OUTPUT-MODEL |
Orchestration above a single scan
Section titled “Orchestration above a single scan”graph LR SCAN["scan<br/>one fold / replay over<br/>the candidate set"] SCAN --> Q1["query Q₁"] SCAN --> Q2["query Q₂"] SCAN --> QN["…"] Q1 --> R1["reducer"] Q1 --> R2["reducer"] Q2 --> R3["reducer"] classDef scan fill:#2563eb,color:#fff,stroke:#93c5fd,stroke-width:2px; class SCAN scan;A scan hosts many queries (stages 2–4); each query feeds a list of reducers (stage 5), one-to-many; a reducer belongs to exactly one query (any combination of matches is itself a query). All share the single fold; the candidate set replayed is the union of the queries’ header-prefilters — a game is replayed if any query might want it.
$$ \text{cost(scan)} \approx \text{replay} + \sum \text{features} + \sum \text{predicates} + \sum \text{reduces} $$
The replay dominates and is paid once; features and predicates add sub-replay per-ply work that scales with the query+output size; reduces are cheap folds. Hence: many queries, many reducers, or many output bitmaps on one scan are ~free relative to running that many separate scans.
Why this matters
Section titled “Why this matters”- It homes “feature.” A measurement is a stage-1 object, the substrate a predicate thresholds and a reducer keys on — not a fourth thing wedged between predicate and output.
- It makes the design extensible at the right seam. A new chess concept is a new feature; a new test is a predicate over it; a new output is a reducer. Each is a node at one stage, never a special case bolted on.
- It explains the cost model and the multi-output story in one frame: the shared fold.