Predicate Library

Status: TEMPLATE. Sections below are scaffolds for collaborative authoring. Each predicate entry will eventually carry its exact bitboard expression and its quantifier defaults; for now we’re sketching the catalog — what’s in and what’s not.

Implemented today: header predicates + a first slice of position predicates — queens_off, bishop_pair, doubled_pawn, isolated_pawn, passed_pawn, rook_on_seventh, rook_on_open_file, king_castled (side-parameterized), and material_count (the --count expr). The remaining entries below are target design. For what actually runs, see README → Status.

This document catalogs the chess-domain primitives that the PREDICATE-LANGUAGE composes. Every named predicate here is a function on a single board state (a PositionPred) or a single header (a HeaderPred). The library grows over time as new chess concepts get use cases.

This is where the chess knowledge lives. The language is generic; the library is opinionated chess.

Predicate specification format

Cover: what every entry in this doc must include. Standardize the entry shape so library additions are auditable.

Each predicate entry is:

### name
**Type:** PositionPred | HeaderPred
**Signature:** name(param1, param2, ...) — parameter list
**Definition:** plain-English semantics
**Bitboard expression:** the exact C/C++ expression
**Cost:** approximate cycles per evaluation
**Selectivity hint:** approximate fraction of games matching `Ever(P)`
                     in a typical mixed corpus (advisory only — the
                     planner can refine empirically)
**Notes:** edge cases, related predicates, history

Header predicates (Tier 1: fastest)

All header predicates evaluate in one shared pass over the 36-byte ScoreHeader, ~7 ms over 10M games regardless of how many header predicates a query uses. The planner always runs them first.

List with one-line definitions:

eco_in(range) — ECO tag in a closed range like B20-B99
eco_eq(code) — exact ECO match
result(W|B|D|decisive) — game result
termination(...) — termination reason (normal / time / abandoned / …)
white_elo_ge / le / between(...) — Elo bands, white
black_elo_ge / le / between(...) — Elo bands, black
both_elo_ge(N) — both players ≥ N
tc_category(...) — bullet / blitz / rapid / classical / correspondence
year_eq / year_in(range) — game date
white_name(...) / black_name(...) / played_by(...) — by player
min_ply / max_ply(N) — game length bounds
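As a rough illustration of the "one shared pass" idea, the sketch below evaluates two header predicates per record in a single loop. Everything about the layout is an assumption: `HeaderView` is a hypothetical three-field stand-in (the real 36-byte ScoreHeader layout is not given here), and `eco_pack` is an invented packing of ECO codes into integers so that a closed range such as B20-B99 becomes two integer comparisons.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical subset of header fields. The real 36-byte ScoreHeader layout
// is not specified in this document.
struct HeaderView {
    std::uint16_t eco;        // packed ECO code (see eco_pack)
    std::uint16_t white_elo;
    std::uint16_t black_elo;
};

// Assumed packing: "A00" -> 0, "B20" -> 120, "E99" -> 499.
constexpr std::uint16_t eco_pack(const char* code) {
    return static_cast<std::uint16_t>((code[0] - 'A') * 100 +
                                      (code[1] - '0') * 10 +
                                      (code[2] - '0'));
}

// eco_in(range): closed range on the packed code.
constexpr bool eco_in(const HeaderView& h, std::uint16_t lo, std::uint16_t hi) {
    return h.eco >= lo && h.eco <= hi;
}

// both_elo_ge(N): both players rated at least n.
constexpr bool both_elo_ge(const HeaderView& h, std::uint16_t n) {
    return h.white_elo >= n && h.black_elo >= n;
}

// One pass over the headers; every header predicate is checked per record,
// so adding predicates does not add passes.
constexpr std::size_t count_matches(const HeaderView* hs, std::size_t n,
                                    std::uint16_t eco_lo, std::uint16_t eco_hi,
                                    std::uint16_t min_elo) {
    std::size_t hits = 0;
    for (std::size_t i = 0; i < n; ++i) {
        const bool ok = eco_in(hs[i], eco_lo, eco_hi) && both_elo_ge(hs[i], min_elo);
        hits += ok ? 1 : 0;
    }
    return hits;
}
```

A query such as eco_in(B20-B99) combined with both_elo_ge(2500) would then call `count_matches(headers, n, eco_pack("B20"), eco_pack("B99"), 2500)`.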

Open question: how do we name the “either-side” predicates cleanly? played_by(name) for “name is on either side”; white_name(name) / black_name(name) for specific side. Or any_player(name)?

Open question: do we want compound aliases here, like gm for both_elo_ge(2500) or master for both_elo_ge(2200)? Or do those live one layer up (the language frontend resolves the alias)?

Position predicates

The bitboard-native chess predicates. Each operates on a single Board and returns bool. Grouped by topic.

Material

bishop_pair(side) — side has a bishop of each color AND the opponent does not (the bishop-pair advantage, not merely owning two bishops): (side.B & LIGHT) && (side.B & DARK) && !((opp.B & LIGHT) && (opp.B & DARK))
opposite_bishops — both sides have exactly one bishop, on opposite colors
same_color_bishops — both sides have exactly one bishop, on same color
queens_off — no queen on the board
two_minors(side) — popcount(side.N | side.B) == 2
material_count(expr) — flexible count expression, generalizes v1’s --count
material_balance(score) — total material score for white minus black
equal_material — material balance == 0
pawn_count(side, op, n) — popcount(side.P) op n
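A minimal sketch of how a few of the material predicates above might look as bitboard expressions. It assumes a square mapping of a1 = bit 0 through h8 = bit 63 (so a1 is a dark square) and a hypothetical `Side` struct with one bitboard per piece type; neither is confirmed by this document. The hand-rolled `popcount` is only there to keep the sketch portable; `std::popcount` from C++20 `<bit>` would be the usual choice.

```cpp
#include <cstdint>

using Bitboard = std::uint64_t;

// Assumes a1 = bit 0, h1 = bit 7, h8 = bit 63; a1 is a dark square.
constexpr Bitboard LIGHT = 0x55AA55AA55AA55AAULL;
constexpr Bitboard DARK  = ~LIGHT;

// Hypothetical per-side piece sets.
struct Side { Bitboard P, N, B, R, Q, K; };

constexpr Bitboard bit(int sq) { return Bitboard{1} << sq; }

// Portable constexpr population count (clears the lowest set bit each step).
constexpr int popcount(Bitboard b) {
    int n = 0;
    while (b) { b &= b - 1; ++n; }
    return n;
}

constexpr bool has_both_colors(const Side& s) {
    return (s.B & LIGHT) && (s.B & DARK);
}

// bishop_pair(side): side has the pair and the opponent does not.
constexpr bool bishop_pair(const Side& side, const Side& opp) {
    return has_both_colors(side) && !has_both_colors(opp);
}

constexpr bool one_bishop_each(const Side& w, const Side& b) {
    return popcount(w.B) == 1 && popcount(b.B) == 1;
}

constexpr bool opposite_bishops(const Side& w, const Side& b) {
    return one_bishop_each(w, b) &&
           (((w.B & LIGHT) != 0) != ((b.B & LIGHT) != 0));
}

constexpr bool same_color_bishops(const Side& w, const Side& b) {
    return one_bishop_each(w, b) &&
           (((w.B & LIGHT) != 0) == ((b.B & LIGHT) != 0));
}

constexpr bool queens_off(const Side& w, const Side& b) {
    return (w.Q | b.Q) == 0;
}

constexpr bool two_minors(const Side& s) {
    return popcount(s.N | s.B) == 2;
}
```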

Open question: do we want a single flexible count(expr) primitive (v1-style "R=r", "QBNqbn=0") or named-shape primitives (equal_rooks, no_minors_no_majors)?

Pawn structure

doubled_pawn(side, file?) — side has ≥2 pawns on a file (any file by default)
isolated_pawn(side, file?) — pawn with no friendly pawns on adjacent files
passed_pawn(side, file?) — pawn with no enemy pawns ahead of it on its own file or the adjacent files
backward_pawn(side, file?) — pawn unable to advance, no friendly support
pawn_chain(side, min_length) — diagonal chain of N pawns
locked_center — heuristic; pawns on d-/e-files of both sides immobile
open_file(file) — no pawns of either side on this file
half_open_file(side, file) — no friendly pawns; enemy pawns present
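The file-based pawn predicates above reduce to a handful of mask operations. The following is a sketch under stated assumptions, not the implemented expressions: squares are numbered a1 = 0 through h8 = 63, files are indexed 0 (a) to 7 (h), and the passed-pawn test is written for white only (black would mirror the "ahead" mask). All helper names are illustrative.

```cpp
#include <cstdint>

using Bitboard = std::uint64_t;

constexpr Bitboard FILE_A = 0x0101010101010101ULL;

constexpr Bitboard bit(int sq) { return Bitboard{1} << sq; }
constexpr Bitboard file_mask(int f) { return FILE_A << f; }

constexpr Bitboard adjacent_files(int f) {
    return (f > 0 ? file_mask(f - 1) : 0) | (f < 7 ? file_mask(f + 1) : 0);
}

// Two or more own pawns on file f: clearing the lowest bit leaves something.
constexpr bool doubled_pawn_on(Bitboard own, int f) {
    const Bitboard p = own & file_mask(f);
    return (p & (p - 1)) != 0;
}

// "Any file by default" variant.
constexpr bool doubled_pawn(Bitboard own) {
    for (int f = 0; f < 8; ++f)
        if (doubled_pawn_on(own, f)) return true;
    return false;
}

// A pawn on file f with no friendly pawns on the adjacent files.
constexpr bool isolated_pawn_on(Bitboard own, int f) {
    return (own & file_mask(f)) != 0 && (own & adjacent_files(f)) == 0;
}

// Squares ahead of a white pawn on its own file and the adjacent files.
constexpr Bitboard front_span_white(int sq) {
    const int f = sq & 7, r = sq >> 3;
    const Bitboard files = file_mask(f) | adjacent_files(f);
    const Bitboard ahead = (r < 7) ? (~Bitboard{0} << (8 * (r + 1))) : 0;
    return files & ahead;
}

constexpr bool is_passed_white(Bitboard black_pawns, int sq) {
    return (black_pawns & front_span_white(sq)) == 0;
}

constexpr bool open_file(Bitboard white_pawns, Bitboard black_pawns, int f) {
    return ((white_pawns | black_pawns) & file_mask(f)) == 0;
}

constexpr bool half_open_file(Bitboard own, Bitboard enemy, int f) {
    return (own & file_mask(f)) == 0 && (enemy & file_mask(f)) != 0;
}
```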

King safety

king_castled(side) — king has moved to a castled square
king_castled_kingside(side) — specifically kingside (g1 / g8)
king_castled_queenside(side) — specifically queenside (c1 / c8)
opposite_castled — sides castled on opposite wings
king_in_center(side) — king on its starting rank, central files
king_exposed(side) — heuristic; few friendly pawns adjacent to king
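Taking the one-line definitions literally (king on g1/g8 or c1/c8), the castling predicates could be sketched as purely positional tests, as below. This is an assumption about the semantics: a positional test cannot tell a castled king from one that walked to the same square, and the implemented king_castled may well differ. The `Color` enum and square constants are illustrative, with a1 = bit 0.

```cpp
#include <cstdint>

using Bitboard = std::uint64_t;

enum class Color { White, Black };

constexpr Bitboard bit(int sq) { return Bitboard{1} << sq; }

// Square indices, assuming a1 = 0 ... h8 = 63.
constexpr int C1 = 2, G1 = 6, C8 = 58, G8 = 62;

constexpr bool king_castled_kingside(Bitboard king, Color c) {
    return (king & bit(c == Color::White ? G1 : G8)) != 0;
}

constexpr bool king_castled_queenside(Bitboard king, Color c) {
    return (king & bit(c == Color::White ? C1 : C8)) != 0;
}

constexpr bool king_castled(Bitboard king, Color c) {
    return king_castled_kingside(king, c) || king_castled_queenside(king, c);
}

// Sides castled on opposite wings.
constexpr bool opposite_castled(Bitboard white_king, Bitboard black_king) {
    return (king_castled_kingside(white_king, Color::White) &&
            king_castled_queenside(black_king, Color::Black)) ||
           (king_castled_queenside(white_king, Color::White) &&
            king_castled_kingside(black_king, Color::Black));
}
```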

Piece activity

rook_on_seventh(side) — at least one rook on side’s 7th relative rank
rook_on_open_file(side) — at least one rook on a file with no pawns
rook_on_half_open(side) — at least one rook on side’s half-open file
knight_on_outpost(side, rank_floor) — knight on a square that no enemy pawn attacks and that a friendly pawn supports
bishop_on_long_diag(side) — at least one bishop on a long diagonal
bishop_fianchetto(side) — bishop on g2/b2 (white) or g7/b7 (black)
knight_on_edge(side) — knight on a- or h-file
queen_developed(side) — queen has moved off its home square
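Several of the piece-activity predicates above are single mask tests once the right masks exist. The sketch below assumes a1 = bit 0 and uses illustrative names (`Color`, `file_fill`, `seventh_rank`); it is one possible shape, not the implemented rook_on_seventh or rook_on_open_file.

```cpp
#include <cstdint>

using Bitboard = std::uint64_t;

enum class Color { White, Black };

constexpr Bitboard FILE_A = 0x0101010101010101ULL;
constexpr Bitboard FILE_H = FILE_A << 7;
constexpr Bitboard RANK_1 = 0xFFULL;

constexpr Bitboard bit(int sq) { return Bitboard{1} << sq; }
constexpr Bitboard rank_mask(int r) { return RANK_1 << (8 * r); }

// Relative 7th rank: rank 7 for white, rank 2 for black.
constexpr Bitboard seventh_rank(Color c) {
    return c == Color::White ? rank_mask(6) : rank_mask(1);
}

constexpr bool rook_on_seventh(Bitboard rooks, Color c) {
    return (rooks & seventh_rank(c)) != 0;
}

// Smear every set bit over its whole file (fill down, then up).
constexpr Bitboard file_fill(Bitboard b) {
    b |= b >> 8;  b |= b >> 16;  b |= b >> 32;
    b |= b << 8;  b |= b << 16;  b |= b << 32;
    return b;
}

// At least one rook on a file that holds no pawns of either side.
constexpr bool rook_on_open_file(Bitboard rooks, Bitboard all_pawns) {
    return (rooks & ~file_fill(all_pawns)) != 0;
}

constexpr bool knight_on_edge(Bitboard knights) {
    return (knights & (FILE_A | FILE_H)) != 0;
}

// b2/g2 for white, b7/g7 for black.
constexpr bool bishop_fianchetto(Bitboard bishops, Color c) {
    return (bishops & (c == Color::White ? (bit(9) | bit(14))
                                         : (bit(49) | bit(54)))) != 0;
}
```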

Tactical / threat (engine-evaluated)

Cover: these need engine help — they’re not pure bitboard pattern matches. They cost more per evaluation.

in_check(side) — side is to move and its king is in check
attacked_piece(side, type) — side has a piece of type under attack
hanging_piece(side, type) — attacked piece with no defender
pinned_piece(side) — side has a pinned piece
discovered_check_available(side) — side can play a discovered check

Open question: how deep into tactical territory does this library go? Predicates like mate_in_one are technically expressible but very expensive per evaluation. Do they live here, or in a separate “tactical-patterns” library that’s opt-in?

Game-state

move_made(piece, from?, to?) — at this ply, a specific move was just played
capture_made — at this ply, the previous move was a capture
promotion_made(promo_type?) — at this ply, a pawn just promoted
castled(side) — at this ply, side just castled

Open question: these are move-of-the-ply predicates, distinct from position-state predicates. Different evaluation pattern. Same PositionPred type or a separate one?

Phase classifier

Cover: definition of opening / middlegame / endgame for use with InPhase(P, phase).

Phase is one of: opening, middlegame, endgame. The classifier runs on each ply and returns the current phase.

Open question: definition choice. Candidates:

Move-number-based: opening = plies 1-20, middlegame = 21-60, endgame = 61+. Simple, deterministic, ignores actual position.

Material-based: endgame = total minor+major piece score ≤ N for both sides. Position-aware. Common in chess engines.

Move-and-material hybrid: opening = plies 1-20 OR queens still on home squares; endgame = material thresholds; else middlegame.

Pick one for V1; document the choice.
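To make the first two candidates concrete, here is a small sketch. The ply boundaries come straight from the move-number-based candidate; the piece values (3/3/5/9) are the conventional ones and are an assumption here, and the threshold N is left as a parameter because no value has been chosen. The function names are illustrative.

```cpp
enum class Phase { Opening, Middlegame, Endgame };

// Move-number-based candidate: plies 1-20, 21-60, 61+.
constexpr Phase phase_by_ply(int ply) {
    return ply <= 20 ? Phase::Opening
         : ply <= 60 ? Phase::Middlegame
                     : Phase::Endgame;
}

// Minor + major piece score for one side.
// Piece values are the conventional 3/3/5/9 (an assumption, not from the spec).
constexpr int piece_score(int knights, int bishops, int rooks, int queens) {
    return 3 * knights + 3 * bishops + 5 * rooks + 9 * queens;
}

// Material-based candidate: endgame when both sides are at or below n.
constexpr bool is_endgame_by_material(int white_score, int black_score, int n) {
    return white_score <= n && black_score <= n;
}
```

The ply-based classifier is a pure function of the ply counter, while the material-based one needs the board, which is the trade-off between the two candidates.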

Material count expression language

Cover: the syntax of material_count(expr). Generalizes v1’s --count "R=r" / "QBNqbn=0" / "P<p".

Expressions use single-letter piece codes (uppercase = white, lowercase = black) with comparison operators:

R=r — equal rook counts
P>p — white has more pawns
QBNqbn=0 — no queens, bishops, or knights on either side (rooks are not constrained)
B>=2 — white has at least 2 bishops
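The examples above suggest a very small grammar: one or more piece letters, a comparison operator, then either more piece letters or an integer. The sketch below evaluates that assumed grammar against a hypothetical `PieceCounts` struct; the exact v1 syntax is not reproduced here, kings are not accepted as piece codes, and malformed input simply evaluates to false. Those are all choices made for illustration.

```cpp
// Hypothetical per-position piece counts (uppercase = white, lowercase = black).
struct PieceCounts {
    int P, N, B, R, Q;
    int p, n, b, r, q;
};

constexpr bool is_piece(char c) {
    return c == 'P' || c == 'N' || c == 'B' || c == 'R' || c == 'Q' ||
           c == 'p' || c == 'n' || c == 'b' || c == 'r' || c == 'q';
}

constexpr int count_of(const PieceCounts& pc, char c) {
    switch (c) {
        case 'P': return pc.P;  case 'N': return pc.N;  case 'B': return pc.B;
        case 'R': return pc.R;  case 'Q': return pc.Q;  case 'p': return pc.p;
        case 'n': return pc.n;  case 'b': return pc.b;  case 'r': return pc.r;
        case 'q': return pc.q;  default:  return 0;
    }
}

// Sums a run of piece letters starting at s[i]; advances i past the run.
constexpr int sum_letters(const PieceCounts& pc, const char* s, int& i) {
    int total = 0;
    while (is_piece(s[i])) total += count_of(pc, s[i++]);
    return total;
}

// Parses a run of decimal digits starting at s[i]; advances i past the run.
constexpr int parse_number(const char* s, int& i) {
    int v = 0;
    while (s[i] >= '0' && s[i] <= '9') v = v * 10 + (s[i++] - '0');
    return v;
}

// Evaluates e.g. "R=r", "P>p", "QBNqbn=0", "B>=2". Malformed input -> false.
constexpr bool material_count(const PieceCounts& pc, const char* expr) {
    int i = 0;
    if (!is_piece(expr[i])) return false;
    const int lhs = sum_letters(pc, expr, i);
    const char op = expr[i];
    if (op != '=' && op != '<' && op != '>') return false;
    ++i;
    bool or_equal = false;
    if (op != '=' && expr[i] == '=') { or_equal = true; ++i; }
    const int start = i;
    const int rhs = is_piece(expr[i]) ? sum_letters(pc, expr, i)
                                      : parse_number(expr, i);
    if (i == start || expr[i] != '\0') return false;  // empty or trailing junk
    if (op == '=') return lhs == rhs;
    if (op == '<') return or_equal ? lhs <= rhs : lhs < rhs;
    return or_equal ? lhs >= rhs : lhs > rhs;
}
```

Note that under this reading `QBNqbn=0` sums six counts and compares the total with zero, so it holds only when no queen, bishop, or knight remains for either side.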

Open question: is this its own mini-language, or do we lift it into the predicate library directly via material_count("R=r") being a PositionPred? Either works; the former is more compact.

Predicate authoring guide

Cover: process for adding a new predicate to the library.

Name and signature

Bitboard expression (should be branch-free and SIMD-friendly where possible)

Cost estimation (target: ≤20 cycles per evaluation for “cheap” position predicates; tactical / engine-evaluated can cost more)

Selectivity hint (rough fraction matching Ever(P) on Lichess corpus)

Test case (one query that exercises it end-to-end)

Doc entry following the spec format above
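As an illustration of the branch-free guideline in the list above, the sketch below rewrites the bishop_pair expression from the Material section without `&&`, using only bitwise operations on 0/1 values. The helper names are hypothetical, and the square-color masks assume a1 = bit 0. Compilers often emit branch-free code for the short-circuit form as well, so this mainly makes the intent explicit.

```cpp
#include <cstdint>

using Bitboard = std::uint64_t;

// Assumes a1 = bit 0; a1 is a dark square.
constexpr Bitboard LIGHT = 0x55AA55AA55AA55AAULL;
constexpr Bitboard DARK  = ~LIGHT;

constexpr Bitboard bit(int sq) { return Bitboard{1} << sq; }

// 1 if b has any bit set, else 0: the top bit of (b | -b) is set iff b != 0.
constexpr unsigned nonzero(Bitboard b) {
    return static_cast<unsigned>((b | (Bitboard{0} - b)) >> 63);
}

// 1 if the bishop set covers both square colors.
constexpr unsigned has_both_colors(Bitboard bishops) {
    return nonzero(bishops & LIGHT) & nonzero(bishops & DARK);
}

// Branch-free bishop_pair: side has both colors AND the opponent does not.
constexpr unsigned bishop_pair_bf(Bitboard side_bishops, Bitboard opp_bishops) {
    return has_both_colors(side_bishops) & (has_both_colors(opp_bishops) ^ 1u);
}
```

Returning 0/1 as an integer rather than `bool` keeps the result easy to accumulate or combine across many positions.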