Predicate Language
Status: TEMPLATE. Sections below are scaffolds for collaborative authoring. The “open questions” lines mark spots where we still need to make decisions; remove them as they’re resolved.
Status (2026-05-29): the C-S-W engine and the --query text-DSL parser are built (QUERY-LANGUAGE); the parser lowers directly to the engine’s query structs (Option 1). This doc’s canonical JSON AST is the future interchange format (Option 2). The EBNF below predates the C-S-W condition model (no side-quantifiers, simultaneity, DNF, or became/ceased) and will be redefined when a JSON consumer (workbench GUI or daemon) lands. For what actually runs, see README → Status.
The Edge query language is an algebra of typed predicates over the corpus. This doc defines the canonical form — the abstract syntax tree (AST) that every frontend (CLI flags, GUI widgets, NLP translation) produces. The engine evaluates the canonical form; the surface syntax is downstream.
Sibling docs:
- PREDICATE-LIBRARY.md — the chess-domain primitives (the named predicates the language composes).
- PLANNER.md — how the engine executes a predicate AST.
- OUTPUT-MODEL.md — the output/reducer axis (the third axis alongside predicate + quantifier).
- OPENING-ALIASES.md — opening-name → ECO resolution (one specific predicate-input transform).
- QUERY-LANGUAGE.md — the CLI surface today.
Why an algebra, not “a list of flags”
Cover: design pressure of composability strength; predicate-language is the source of truth, frontends translate to it; type system enforces unambiguity; downstream consumers (CLI / GUI / NLP) all produce canonical AST.
Type system
Cover: the four core types and what they represent.
- PositionPred: Board → bool. A check against a single board state. Examples: bishop_pair, doubled_pawn(file=d).
- HeaderPred: GameHeader → bool. A check against the 36-byte PGN header. Examples: eco_in(B20-B99), both_elo_ge(2500).
- GamePred: Game → bool. A check against the whole game (a ply sequence). Produced by lifting a PositionPred with a quantifier. Examples: Ever(bishop_pair), Streak(passed_pawn, 8).
- GameSet: a set of matched games. The output of a query; the input to set algebra.
Resolved 2026-05-28: no MovePred type. A predicate is a pure function of the current board plus declared cross-ply state; adjacent-ply (“move”) conditions are a 1-bit Became(P) / Ceased(P) edge over a position predicate. The output/reducer is likewise not a type; it’s a Scan attribute (a list, so multi-output). See OUTPUT-MODEL.md #1, #4, #7.
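To make the "1-bit edge" idea concrete, here is a minimal sketch of Became(P) and Ceased(P) computed in one pass with a single remembered bit. The Board stand-in, the helper names, and the choice that P counts as false before the first ply are assumptions for illustration, not the engine's actual definitions.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in: the real Board layout is not specified in this doc.
Board = Dict[str, int]
PositionPred = Callable[[Board], bool]


def _edge(p: PositionPred, boards: List[Board], rising: bool) -> List[bool]:
    """Per-ply edge flags, carrying exactly one bit of cross-ply state."""
    out = []
    prev = False  # assumption: P counts as false before the first ply
    for board in boards:
        cur = p(board)
        out.append((cur and not prev) if rising else (prev and not cur))
        prev = cur
    return out


def became(p: PositionPred, boards: List[Board]) -> List[bool]:
    """True on plies where p holds now but did not hold on the previous ply."""
    return _edge(p, boards, rising=True)


def ceased(p: PositionPred, boards: List[Board]) -> List[bool]:
    """True on plies where p held on the previous ply but does not hold now."""
    return _edge(p, boards, rising=False)
```

Because the result is again a per-ply boolean, an edge can be fed to the same quantifiers as any other position-level condition.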
Composition rules
Cover: which operators preserve which types. The legal-compositions table.
    Operator           Inputs                  → Output type
    ─────────────────  ──────────────────────    ──────────────
    ∧, ∨, ¬            PositionPred*           → PositionPred
    ∧, ∨, ¬            HeaderPred*             → HeaderPred
    quantifier (see)   PositionPred            → GamePred
    ∧                  GamePred × GamePred     → GamePred
    ∧                  HeaderPred × GamePred   → GamePred (composes across tiers)
    set algebra        GameSet × GameSet       → GameSet

Open question: do we allow mixing PositionPred and HeaderPred at the position-level AND? E.g., “on a ply where bishop pair AND game’s ECO is B20-B99” is degenerate (the second operand is constant per game), but what about AtPly(P, n) AND HeaderPred? It probably collapses to the tier-crossing HeaderPred × GamePred row, since AtPly(P, n) is already a GamePred.
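One way to read the composition table is as a small type checker over the AST. The sketch below infers a node's type and rejects any combination the table does not list. The tuple encoding, the leaf registry, and the decision to reject position-level PositionPred ∧ HeaderPred (still an open question) are assumptions made for illustration.

```python
POSITION, HEADER, GAME = "PositionPred", "HeaderPred", "GamePred"

# Hypothetical leaf registry; the real names live in PREDICATE-LIBRARY.md.
LEAF_TYPES = {
    "bishop_pair": POSITION,
    "passed_pawn": POSITION,
    "eco_in": HEADER,
    "both_elo_ge": HEADER,
}
QUANTIFIERS = {"ever", "never", "always", "streak"}


def type_of(node):
    """Infer the type of a tuple-encoded node such as ("and", child, child)."""
    op, *args = node
    if op in LEAF_TYPES:
        return LEAF_TYPES[op]
    if op in QUANTIFIERS:
        # A quantifier lifts exactly one PositionPred (its first argument).
        if type_of(args[0]) != POSITION:
            raise TypeError(f"{op} lifts a PositionPred only")
        return GAME
    if op == "not":
        t = type_of(args[0])
        if t == GAME:
            raise TypeError("the table lists ¬ for PositionPred and HeaderPred only")
        return t
    if op in ("and", "or"):
        types = {type_of(a) for a in args}
        if types in ({POSITION}, {HEADER}):
            return types.pop()  # same-tier boolean algebra preserves the type
        if op == "and" and GAME in types and types <= {HEADER, GAME}:
            return GAME  # GamePred ∧ GamePred, or HeaderPred ∧ GamePred
        raise TypeError(f"{op} over {sorted(types)} is not in the table")
    raise ValueError(f"unknown operator: {op}")
```

Rejecting unlisted combinations by default keeps the open question visible: if position-level mixing is later allowed, it becomes one extra branch rather than an implicit behavior.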
Quantifiers — lifting PositionPred → GamePred
Cover: the modifier vocabulary. Each entry: name, signature, semantics, performance notes.
Existential / universal:
- Ever(P): ∃ ply where P holds. The default if no quantifier is specified.
- Never(P): ∀ plies, ¬P holds.
- Always(P): ∀ plies, P holds.
Streak (min-consecutive):
- Streak(P, n): ∃ a run of ≥ n consecutive plies on which P holds. (Inherited from v1’s --min-streak.)
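The existential, universal, and streak quantifiers above can each be sketched as a single pass over the ply stream. This is a reference-semantics sketch, not the engine's implementation; the function names and the representation of a game as an iterable of boards are assumptions.

```python
from typing import Callable, Iterable, TypeVar

B = TypeVar("B")  # hypothetical board type; any value the predicate accepts


def ever(p: Callable[[B], bool], plies: Iterable[B]) -> bool:
    """True if p holds on at least one ply."""
    return any(p(b) for b in plies)


def never(p: Callable[[B], bool], plies: Iterable[B]) -> bool:
    """True if p holds on no ply."""
    return not any(p(b) for b in plies)


def always(p: Callable[[B], bool], plies: Iterable[B]) -> bool:
    """True if p holds on every ply (vacuously true for an empty game)."""
    return all(p(b) for b in plies)


def streak(p: Callable[[B], bool], plies: Iterable[B], n: int) -> bool:
    """True if p holds on at least n consecutive plies somewhere in the game."""
    if n < 1:
        raise ValueError("n must be at least 1")
    run = 0  # the only cross-ply state: length of the current run
    for b in plies:
        run = run + 1 if p(b) else 0
        if run >= n:
            return True  # early exit once the run is long enough
    return False
```

Note that Ever and Streak can stop at the first witness, while Never and Always can stop at the first counterexample, which matters for scan cost on long games.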
Positional bounds:
- AtPly(P, n): P holds at the specific ply n.
- FromPly(P, n): Ever(P) restricted to plies ≥ n.
- UntilPly(P, n): Ever(P) restricted to plies ≤ n.
- BetweenPly(P, lo, hi): Ever(P) restricted to plies in [lo, hi].
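The positional bounds all reduce to Ever(P) over a ply window. The sketch below assumes plies are numbered from 1 (consistent with "move N = plies 2N−1, 2N") and that a game is an in-memory sequence of boards; both are assumptions for illustration, as are the function names.

```python
from typing import Callable, Sequence, TypeVar

B = TypeVar("B")  # hypothetical board type


def between_ply(p: Callable[[B], bool], plies: Sequence[B], lo: int, hi: int) -> bool:
    """Ever(p) restricted to 1-based plies in the closed range [lo, hi]."""
    return any(p(b) for ply, b in enumerate(plies, start=1) if lo <= ply <= hi)


def from_ply(p: Callable[[B], bool], plies: Sequence[B], n: int) -> bool:
    """Ever(p) restricted to plies >= n."""
    return between_ply(p, plies, n, len(plies))


def until_ply(p: Callable[[B], bool], plies: Sequence[B], n: int) -> bool:
    """Ever(p) restricted to plies <= n."""
    return between_ply(p, plies, 1, n)


def at_ply(p: Callable[[B], bool], plies: Sequence[B], n: int) -> bool:
    """p holds at ply n; False if the game has no such ply."""
    return 1 <= n <= len(plies) and p(plies[n - 1])
```

Expressing FromPly, UntilPly, and AtPly through one windowed primitive keeps the boundary convention (closed ranges, 1-based) in a single place.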
Phase-based:
- InPhase(P, phase): Ever(P) restricted to the plies a phase classifier assigns to the given phase (opening / middlegame / endgame). The phase definition lives in PREDICATE-LIBRARY.
Temporal sequence (harder — V2?):
- Then(P, Q): ∃ plies i < j such that P holds at i AND Q holds at j.
- While(P, Q): ∃ plies i < j such that Q holds at j AND P holds at all plies in [i, j].
- Until(P, Q): ∃ ply i such that P holds at all plies < i AND Q holds at i.
Open question: temporal-sequence quantifiers roughly double the implementation complexity (state machines over the ply stream). Worth gating to V2? (Resolved 2026-05-28: yes, V2; see Open design decisions.)
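For reference, the three temporal definitions can be read literally as small single-pass state machines. This sketch only pins down the semantics as written above; the function names and the board representation are assumptions, and it is not a proposal for how V2 should implement them.

```python
from typing import Callable, Iterable, TypeVar

B = TypeVar("B")  # hypothetical board type
Pred = Callable[[B], bool]


def then(p: Pred, q: Pred, plies: Iterable[B]) -> bool:
    """Exists i < j with p at i and q at j."""
    seen_p = False
    for b in plies:
        if seen_p and q(b):  # check q before recording p, so i < j is strict
            return True
        if p(b):
            seen_p = True
    return False


def while_(p: Pred, q: Pred, plies: Iterable[B]) -> bool:
    """Exists i < j with q at j and p on every ply in [i, j].

    Read literally, the shortest witness is i = j - 1, so it is enough to
    find two adjacent plies where p holds on both and q holds on the second.
    """
    prev_p = False
    for b in plies:
        cur_p = p(b)
        if prev_p and cur_p and q(b):
            return True
        prev_p = cur_p
    return False


def until(p: Pred, q: Pred, plies: Iterable[B]) -> bool:
    """Exists i with q at i and p on every ply before i."""
    for b in plies:
        if q(b):
            return True   # every earlier ply satisfied p, or we would have left
        if not p(b):
            return False  # p broke before q ever held
    return False
```

Each machine needs at most one bit of state beyond the current board, which suggests the extra cost is in planning and composition rather than in the scan loop itself.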
Open question: streak vs Always-within-range. Streak(P, n) is “P for n consecutive plies somewhere”; Always is “every ply”. Is there a use case for “every ply for ≥ n consecutive plies starting from move M”?
Set algebra — GameSet × GameSet → GameSet
Cover: the cross-query composition layer. AND, OR, XOR, SUB, NOT. When does the user reach for this vs in-query AND-chain?
- In-query AND-chain: when all conditions must hold within one game.
- Set algebra: when composing INDEPENDENT query results.
The “Sicilian OR French” case → run two queries, OR the bitmaps. The “Sicilian AND master” case → one query, both predicates AND’d.
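As a sketch of "OR the bitmaps", the five set operations can be written over bitmaps in which bit i is set when game i matched. Using a Python int as the bitmap and passing the corpus size explicitly for NOT are assumptions for illustration; the engine's real bitmap representation is not described here.

```python
from typing import List


def set_and(a: int, b: int) -> int:
    return a & b


def set_or(a: int, b: int) -> int:
    return a | b


def set_xor(a: int, b: int) -> int:
    return a ^ b


def set_sub(a: int, b: int) -> int:
    """Games in a that are not in b."""
    return a & ~b


def set_not(corpus_size: int, a: int) -> int:
    """Complement relative to the corpus; NOT needs to know the universe."""
    universe = (1 << corpus_size) - 1
    return universe & ~a


def members(bitmap: int) -> List[int]:
    """Indices of the games present in the bitmap, in ascending order."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


# Hypothetical 4-game corpus: games 1 and 2 are Sicilians, game 3 is a French.
sicilian = 0b0110
french = 0b1000
either = set_or(sicilian, french)
```

NOT is the only operator that needs the corpus, which is consistent with the grammar giving "not" a Corpus operand while the other set operators take only GameSet expressions.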
Canonical AST grammar
Cover: the formal grammar of the canonical form. EBNF or similar. This is what frontends produce and what the planner consumes.
    Query         ::= GameSetExpr
    GameSetExpr   ::= Scan | SetOp
    Scan          ::= "scan" Corpus Predicate
    SetOp         ::= "and" GameSetExpr+
                    | "or" GameSetExpr+
                    | "sub" GameSetExpr GameSetExpr
                    | "xor" GameSetExpr GameSetExpr
                    | "not" Corpus GameSetExpr

    Predicate     ::= GamePred | PredicateAnd
    PredicateAnd  ::= "and" Predicate+
    GamePred      ::= HeaderPred | LiftedPred
    LiftedPred    ::= Quantifier PositionPred
    Quantifier    ::= "ever" | "never" | "always"
                    | "streak" Number
                    | "at-ply" Number
                    | "from-ply" Number
                    | "until-ply" Number
                    | "between-ply" Number Number
                    | "in-phase" Phase
                    | "then" Predicate Predicate
                    | "while" Predicate Predicate
    PositionPred  ::= NamedPositionPred (Param*) | PositionAnd | PositionOr | PositionNot
    HeaderPred    ::= NamedHeaderPred (Param*) | HeaderAnd | HeaderOr | HeaderNot

Open question: do we use a Lisp-y s-expression form, JSON, or a custom syntax? The canonical form is what AST processors operate on; the surface form is downstream. (For NLP-translation purposes, JSON is the obvious target.) Resolved 2026-05-28: JSON; see Open design decisions.
Worked translations
Cover: 5-10 worked examples translating natural-language queries into canonical form. These also serve as test cases for any frontend.
Example: “Sicilian Najdorf, both 2500+, white wins, blitz.”

    and(
      header_eco_in(B90-B99),
      header_both_elo_ge(2500),
      header_result(W),
      header_tc_category(blitz)
    )

Example: “Games where white had the bishop pair sustained ≥10 plies starting from move 15.”

    and(
      streak(white_bishop_pair, 10) within from_ply(30, ...)
    )

Open question: how do we cleanly express the composition of streak and from-ply? These are two quantifiers on the same PositionPred. Is the right shape Streak(FromPly(P, 30), 10) (apply the bound first, then streak over the restricted range)? (Resolved 2026-05-28: yes, bound first, then streak; see Open design decisions.)
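The "bound first, then streak" reading can be sketched as an evaluator over a nested JSON-style node. Only the nesting via "of" comes from this doc; the "n" and "ply" keys, the predicate-name lookup, the function name, and 1-based ply numbering are hypothetical choices for illustration.

```python
import json
from typing import Callable, Dict, Sequence

# Hypothetical JSON shape for Streak(FromPly(P, 30), 10).
QUERY = {
    "streak": {
        "n": 10,
        "of": {"from_ply": {"ply": 30, "of": "white_bishop_pair"}},
    }
}


def eval_streak_from_ply(
    query: dict,
    plies: Sequence[object],
    predicates: Dict[str, Callable[[object], bool]],
) -> bool:
    """Apply the ply bound first, then look for the streak inside that window."""
    streak = query["streak"]
    bound = streak["of"]["from_ply"]
    p = predicates[bound["of"]]
    window = plies[bound["ply"] - 1:]  # 1-based ply k is index k - 1
    run = 0
    for board in window:
        run = run + 1 if p(board) else 0
        if run >= streak["n"]:
            return True
    return False


# The node is plain data, so it survives a JSON round trip unchanged.
assert json.loads(json.dumps(QUERY)) == QUERY
```

Under this reading a long run that ends before the bound does not count, and a run that straddles the bound counts only for the plies at or after it.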
(More examples to follow.)
Open design decisions
Resolved 2026-05-28 (output-axis decisions in OUTPUT-MODEL.md):
- Canonical unit: ply (half-move). “Move N” is a frontend input/display convenience (move N = plies 2N−1, 2N).
- Surface form: JSON is the canonical AST — machine interchange, what frontends emit and the engine/daemon consume. Ergonomic complex-query input (a text DSL / GUI) is a frontend concern, deferred until a consumer needs it.
- No MovePred type. Predicate = current board + declared cross-ply state (quantifier counter / 1-bit Became/Ceased edge / maintained tally / rarely, prior boards). See OUTPUT-MODEL #4.
- Stacked quantifiers: bound first, then streak, i.e. Streak(FromPly(P, k), n) (JSON nests streak { of: from_ply { of: P }}).
- Side is a predicate parameter, not a variant: isolated_pawn(side, …).
- Temporal-sequence quantifiers (Then / While / Until): V2.
- Output/reducer is a Scan attribute (a list, so multi-output), not a core type. See OUTPUT-MODEL #1, #7.

Still open:
- Phase classifier: piece-count / move-number / material / hybrid? (Chess-domain call — decide when the first phase-dependent predicate is built.)
- Type for “headerless” mode (no ECO / tags in the PGN)?
- Mixing PositionPred ∧ HeaderPred at the position level (likely collapses; confirm).
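The ply convention resolved above (move N = plies 2N−1, 2N) is the kind of conversion every frontend will need. A minimal sketch follows; the function names are hypothetical, and the side labels assume plies are numbered from 1 with White making the first move.

```python
from typing import Tuple


def move_to_plies(move: int) -> Tuple[int, int]:
    """Return the two 1-based plies that make up full move N: (2N - 1, 2N)."""
    if move < 1:
        raise ValueError("move numbers start at 1")
    return (2 * move - 1, 2 * move)


def ply_to_move(ply: int) -> Tuple[int, str]:
    """Return (move number, side to have moved) for a 1-based ply."""
    if ply < 1:
        raise ValueError("ply numbers start at 1")
    # Assumption: odd plies are White's moves, even plies are Black's.
    return ((ply + 1) // 2, "white" if ply % 2 == 1 else "black")
```

For example, move_to_plies(15) gives (29, 30), so a frontend has to decide which of the two plies "starting from move 15" refers to before emitting a ply bound.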