
Predicate Library

Status: TEMPLATE. Sections below are scaffolds for collaborative authoring. Each predicate entry will eventually carry its exact bitboard expression and its quantifier defaults; for now we’re sketching the catalog — what’s in and what’s not.

Implemented today: header predicates + a first slice of position predicates — queens_off, bishop_pair, doubled_pawn, isolated_pawn, passed_pawn, rook_on_seventh, rook_on_open_file, king_castled (side-parameterized), and material_count (the --count expr). The remaining entries below are target design. For what actually runs, see README → Status.

The chess-domain primitives the PREDICATE-LANGUAGE composes. Every named predicate here is a function on a single board state (a PositionPred) or a single header (a HeaderPred). The library grows over time as new chess concepts get use cases.

This is where the chess knowledge lives. The language is generic; the library is opinionated chess.

Cover: what every entry in this doc must include. Standardize the entry shape so library additions are auditable.

Each predicate entry is:

### name
**Type:** PositionPred | HeaderPred
**Signature:** name(param1, param2, ...) — parameter list
**Definition:** plain-English semantics
**Bitboard expression:** the exact C/C++ expression
**Cost:** approximate cycles per evaluation
**Selectivity hint:** approximate fraction of games matching `Ever(P)`
in a typical mixed corpus (advisory only — the
planner can refine empirically)
**Notes:** edge cases, related predicates, history

All header predicates evaluate in one shared pass over the 36-byte ScoreHeader, taking ~7 ms over 10M games regardless of how many header predicates a query uses. The planner always runs them first.

List with one-line definitions:

  • eco_in(range) — ECO tag in a closed range like B20-B99
  • eco_eq(code) — exact ECO match
  • result(W|B|D|decisive) — game result
  • termination(...) — termination reason (normal / time / abandoned / …)
  • white_elo_ge / le / between(...) — ELO bands, white
  • black_elo_ge / le / between(...) — ELO bands, black
  • both_elo_ge(N) — both players ≥ N
  • tc_category(...) — bullet / blitz / rapid / classical / correspondence
  • year_eq / year_in(range) — game date
  • white_name(...) / black_name(...) / played_by(...) — by player
  • min_ply / max_ply(N) — game length bounds
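
As a rough illustration of the shared pass, the sketch below evaluates a few of the header predicates listed above over an array of headers. The `Header` struct, its field names, and `scan_headers` are hypothetical stand-ins: the real 36-byte ScoreHeader layout is not specified in this document, and the comparison rule for `eco_in` (lexicographic on the three-character code) is an assumption.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical header record. The real 36-byte ScoreHeader layout is not
// given in this document; these fields are illustrative only.
struct Header {
    char          eco[4];      // e.g. "B33", NUL-terminated
    std::uint16_t white_elo;
    std::uint16_t black_elo;
    std::uint16_t ply_count;
    char          result;      // 'W', 'B', or 'D'
};

// eco_in(range): closed range, compared lexicographically (assumption).
inline bool eco_in(const Header& h, const char* lo, const char* hi) {
    return std::strncmp(h.eco, lo, 3) >= 0 && std::strncmp(h.eco, hi, 3) <= 0;
}

// both_elo_ge(N): both players rated at least N.
inline bool both_elo_ge(const Header& h, int n) {
    return h.white_elo >= n && h.black_elo >= n;
}

// result(decisive): either side won.
inline bool result_decisive(const Header& h) {
    return h.result == 'W' || h.result == 'B';
}

// min_ply(N): game lasted at least N plies.
inline bool min_ply(const Header& h, int n) { return h.ply_count >= n; }

// One pass over the headers; `pred` may combine any number of header
// predicates, so the scan cost does not grow with the predicate count.
template <class Pred>
std::vector<std::size_t> scan_headers(const std::vector<Header>& headers, Pred pred) {
    std::vector<std::size_t> hits;
    for (std::size_t i = 0; i < headers.size(); ++i) {
        if (pred(headers[i])) hits.push_back(i);
    }
    return hits;
}
```

A query such as "Sicilian games between 2500+ players" would then pass a lambda combining `eco_in(h, "B20", "B99")` and `both_elo_ge(h, 2500)` to `scan_headers`.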

Open question: how do we name the “either-side” predicates cleanly? played_by(name) for “name is on either side”; white_name(name) / black_name(name) for specific side. Or any_player(name)?

Open question: do we want compound aliases here, like gm for both_elo_ge(2500) or master for both_elo_ge(2200)? Or do those live one layer up (the language frontend resolves the alias)?

The bitboard-native chess predicates. Each operates on a single Board and returns bool. Grouped by topic.

  • bishop_pair(side) — side has a bishop of each color AND the opponent does not (the bishop-pair advantage, not merely owning two bishops): (side.B & LIGHT) && (side.B & DARK) && !((opp.B & LIGHT) && (opp.B & DARK))
  • opposite_bishops — both sides have exactly one bishop, on opposite colors
  • same_color_bishops — both sides have exactly one bishop, on same color
  • queens_off — no queen on the board
  • two_minors(side) — popcount(side.N | side.B) == 2
  • material_count(expr) — flexible count expression, generalizes v1’s --count
  • material_balance(score) — total material score for white minus black
  • equal_material — material balance == 0
  • pawn_count(side, op, n) — popcount(side.P) op n
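
A minimal sketch of how a few of these material predicates might look as bitboard code. The `SideBB` / `Board` layout, the square mapping (bit 0 = a1, bit 63 = h8), and the `popcount64` helper are assumptions made for illustration; only the `bishop_pair` and `two_minors` expressions come from the entries above.

```cpp
#include <bitset>
#include <cstdint>

using Bitboard = std::uint64_t;

// Assumed mapping: bit index = rank * 8 + file, so a1 = bit 0 (a dark square).
constexpr Bitboard DARK  = 0xAA55AA55AA55AA55ULL;
constexpr Bitboard LIGHT = ~DARK;

// Hypothetical per-side piece bitboards.
struct SideBB { Bitboard P = 0, N = 0, B = 0, R = 0, Q = 0, K = 0; };
struct Board  { SideBB white, black; };

inline int popcount64(Bitboard b) {
    return static_cast<int>(std::bitset<64>(b).count());
}

inline bool has_both_bishop_colors(const SideBB& s) {
    return (s.B & LIGHT) != 0 && (s.B & DARK) != 0;
}

// bishop_pair(side): side has a bishop of each color and the opponent does not.
inline bool bishop_pair(const SideBB& side, const SideBB& opp) {
    return has_both_bishop_colors(side) && !has_both_bishop_colors(opp);
}

// queens_off: no queen of either color on the board.
inline bool queens_off(const Board& b) { return (b.white.Q | b.black.Q) == 0; }

// two_minors(side): exactly two knights/bishops in total.
inline bool two_minors(const SideBB& side) {
    return popcount64(side.N | side.B) == 2;
}

// opposite_bishops: one bishop each, on different square colors.
inline bool opposite_bishops(const Board& b) {
    if (popcount64(b.white.B) != 1 || popcount64(b.black.B) != 1) return false;
    return ((b.white.B & LIGHT) != 0) != ((b.black.B & LIGHT) != 0);
}
```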

Open question: do we want a single flexible count(expr) primitive (v1-style "R=r", "QBNqbn=0") or named-shape primitives (equal_rooks, no_minors_no_majors)?

  • doubled_pawn(side, file?) — side has ≥2 pawns on a file (any file by default)
  • isolated_pawn(side, file?) — pawn with no friendly pawns on adjacent files
  • passed_pawn(side, file?) — pawn with no enemy pawns ahead of it on its own file or on either adjacent file
  • backward_pawn(side, file?) — pawn that cannot safely advance and has no friendly pawn beside or behind it on an adjacent file to support it
  • pawn_chain(side, min_length) — diagonal chain of N pawns
  • locked_center — heuristic; pawns on d-/e-files of both sides immobile
  • open_file(file) — no pawns of either side on this file
  • half_open_file(side, file) — no friendly pawns; enemy pawns present
  • king_castled(side) — king has moved to a castled square
  • king_castled_kingside(side) — specifically kingside (g1 / g8)
  • king_castled_queenside(side) — specifically queenside (c1 / c8)
  • opposite_castled — sides castled on opposite wings
  • king_in_center(side) — king on its starting rank, central files
  • king_exposed(side) — heuristic; few friendly pawns adjacent to king
  • rook_on_seventh(side) — at least one rook on side’s 7th relative rank
  • rook_on_open_file(side) — at least one rook on a file with no pawns
  • rook_on_half_open(side) — at least one rook on side’s half-open file
  • knight_on_outpost(side, rank_floor) — knight on a square that no enemy pawn attacks and that a friendly pawn defends
  • bishop_on_long_diag(side) — at least one bishop on a long diagonal
  • bishop_fianchetto(side) — bishop on g2/b2 (white) or g7/b7 (black)
  • knight_on_edge(side) — knight on a- or h-file
  • queen_developed(side) — queen has moved off its home square
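
The pawn-structure and rook entries above lend themselves to shift-and-mask code. The following is a sketch under stated assumptions, not the library's actual expressions: the square mapping (bit index = rank * 8 + file), the fill helpers, and the function signatures taking raw pawn and rook bitboards are all hypothetical.

```cpp
#include <cstdint>

using Bitboard = std::uint64_t;

// Assumed mapping: bit index = rank * 8 + file, so a1 = bit 0 and h8 = bit 63.
constexpr Bitboard FILE_A = 0x0101010101010101ULL;
constexpr Bitboard FILE_H = 0x8080808080808080ULL;
constexpr Bitboard RANK_2 = 0x000000000000FF00ULL;
constexpr Bitboard RANK_7 = 0x00FF000000000000ULL;

// Smear every set bit toward rank 8 / rank 1 along its file.
inline Bitboard north_fill(Bitboard b) { b |= b << 8; b |= b << 16; b |= b << 32; return b; }
inline Bitboard south_fill(Bitboard b) { b |= b >> 8; b |= b >> 16; b |= b >> 32; return b; }
inline Bitboard file_fill(Bitboard b)  { return north_fill(b) | south_fill(b); }

// Add the squares one file to the east and west, without wrapping ranks.
inline Bitboard widen(Bitboard b) {
    return b | ((b & ~FILE_H) << 1) | ((b & ~FILE_A) >> 1);
}

// doubled_pawn: some pawn has a friendly pawn behind it on the same file.
inline bool doubled_pawn(Bitboard pawns) {
    return (pawns & north_fill(pawns << 8)) != 0;
}

// isolated_pawn: some pawn has no friendly pawn on either adjacent file.
inline bool isolated_pawn(Bitboard pawns) {
    const Bitboard files = file_fill(pawns);
    const Bitboard neighbours = ((files & ~FILE_H) << 1) | ((files & ~FILE_A) >> 1);
    return (pawns & ~neighbours) != 0;
}

// passed_pawn: some pawn has no enemy pawn ahead on its own or adjacent files.
inline bool passed_pawn(Bitboard own, Bitboard enemy, bool own_is_white) {
    // Squares strictly "in front of" each enemy pawn, from the enemy's view.
    const Bitboard guarded = own_is_white ? south_fill(enemy >> 8)
                                          : north_fill(enemy << 8);
    return (own & ~widen(guarded)) != 0;
}

// open_file(file): no pawns of either side on the file (0 = a, 7 = h).
inline bool open_file(Bitboard all_pawns, int file) {
    return (all_pawns & (FILE_A << file)) == 0;
}

// rook_on_open_file: at least one rook on a file with no pawns at all.
inline bool rook_on_open_file(Bitboard rooks, Bitboard all_pawns) {
    return (rooks & ~file_fill(all_pawns)) != 0;
}

// rook_on_seventh: rank 7 for white, rank 2 for black.
inline bool rook_on_seventh(Bitboard rooks, bool is_white) {
    return (rooks & (is_white ? RANK_7 : RANK_2)) != 0;
}
```

Each predicate reduces to a handful of shifts, ORs, and ANDs with no per-square loop, which is the shape the cost target for cheap position predicates calls for.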

Cover: these need engine help — they’re not pure bitboard pattern matches. They cost more per evaluation.

  • in_check(side) — side's king is in check (in a legal position this can only be the side to move)
  • attacked_piece(side, type) — side has a piece of type under attack
  • hanging_piece(side, type) — attacked piece with no defender
  • pinned_piece(side) — side has a pinned piece
  • discovered_check_available(side) — side can play a discovered check
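
To show why these cost more than a pure mask test, here is a minimal sketch of `in_check` built on a ray-walking attack test. It is deliberately simple rather than fast (a production engine would more likely use precomputed or magic-bitboard attack tables), and the `SideBB` layout, the square mapping, and the helper names are assumptions. Pins and discovered checks are not covered.

```cpp
#include <cstdint>

using Bitboard = std::uint64_t;

// Hypothetical per-side piece bitboards; a1 = bit 0, h8 = bit 63.
struct SideBB { Bitboard P = 0, N = 0, B = 0, R = 0, Q = 0, K = 0; };

inline Bitboard all_pieces(const SideBB& s) { return s.P | s.N | s.B | s.R | s.Q | s.K; }

// Single-square mask, or 0 when (file, rank) is off the board.
inline Bitboard bit(int file, int rank) {
    if (file < 0 || file > 7 || rank < 0 || rank > 7) return 0;
    return 1ULL << (rank * 8 + file);
}

// Walk one ray outward from (file, rank) and return the first occupied square.
inline Bitboard first_blocker(int file, int rank, int df, int dr, Bitboard occupied) {
    for (file += df, rank += dr; bit(file, rank) != 0; file += df, rank += dr) {
        if (bit(file, rank) & occupied) return bit(file, rank);
    }
    return 0;
}

// True if any piece of `by` attacks the square (file, rank).
inline bool is_attacked(int file, int rank, const SideBB& by, bool by_is_white,
                        Bitboard occupied) {
    // A white pawn attacks diagonally upward, so it sits one rank below the target.
    const int pawn_rank = by_is_white ? rank - 1 : rank + 1;
    if ((bit(file - 1, pawn_rank) | bit(file + 1, pawn_rank)) & by.P) return true;

    static const int knight[8][2] = {{1, 2}, {2, 1}, {2, -1}, {1, -2},
                                     {-1, -2}, {-2, -1}, {-2, 1}, {-1, 2}};
    for (const auto& d : knight) {
        if (bit(file + d[0], rank + d[1]) & by.N) return true;
    }

    // First four directions are rook lines, last four are bishop diagonals.
    static const int dirs[8][2] = {{1, 0}, {-1, 0}, {0, 1}, {0, -1},
                                   {1, 1}, {1, -1}, {-1, 1}, {-1, -1}};
    for (int i = 0; i < 8; ++i) {
        const int df = dirs[i][0], dr = dirs[i][1];
        if (bit(file + df, rank + dr) & by.K) return true;
        const Bitboard sliders = (i < 4) ? (by.R | by.Q) : (by.B | by.Q);
        if (first_blocker(file, rank, df, dr, occupied) & sliders) return true;
    }
    return false;
}

// in_check(side): side's king is attacked by the opponent.
inline bool in_check(const SideBB& side, const SideBB& opp, bool side_is_white) {
    const Bitboard occupied = all_pieces(side) | all_pieces(opp);
    for (int sq = 0; sq < 64; ++sq) {
        if ((side.K >> sq) & 1ULL) {
            return is_attacked(sq % 8, sq / 8, opp, !side_is_white, occupied);
        }
    }
    return false;  // no king on the board
}
```

Under the same assumptions, `hanging_piece` could be expressed as "attacked by the opponent and not attacked by its own side", calling `is_attacked` twice per candidate square.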

Open question: how deep into tactical territory does this library go? Predicates like mate_in_one are technically expressible but very expensive per evaluation. Do they live here, or in a separate “tactical-patterns” library that’s opt-in?

  • move_made(piece, from?, to?) — at this ply, a specific move was just played
  • capture_made — at this ply, the move just played was a capture
  • promotion_made(promo_type?) — at this ply, a pawn just promoted
  • castled(side) — at this ply, side just castled
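
One way to picture the different evaluation pattern: these predicates read the move that produced the position rather than the position itself. The sketch below assumes a hypothetical `MoveRecord` carried alongside each ply; the struct, its fields, and the use of -1 for an omitted optional parameter are illustrative choices, not part of the current design.

```cpp
#include <cstdint>

enum class Piece : std::uint8_t { None, Pawn, Knight, Bishop, Rook, Queen, King };

// Hypothetical record of the move that produced the current ply.
struct MoveRecord {
    Piece moved       = Piece::None;
    Piece captured    = Piece::None;  // None when the move was not a capture
    Piece promoted_to = Piece::None;  // None when the move was not a promotion
    int   from        = 0;            // 0..63, a1 = 0
    int   to          = 0;
    bool  white_moved = true;
};

// move_made(piece, from?, to?): -1 stands for an omitted optional parameter.
inline bool move_made(const MoveRecord& m, Piece piece, int from = -1, int to = -1) {
    return m.moved == piece && (from < 0 || m.from == from) && (to < 0 || m.to == to);
}

inline bool capture_made(const MoveRecord& m) { return m.captured != Piece::None; }

// promotion_made(promo_type?): Piece::None matches any promotion piece.
inline bool promotion_made(const MoveRecord& m, Piece promo = Piece::None) {
    if (m.promoted_to == Piece::None) return false;
    return promo == Piece::None || m.promoted_to == promo;
}

// castled(side): a king move of exactly two files is castling.
inline bool castled(const MoveRecord& m, bool white) {
    const int file_delta = (m.to % 8) - (m.from % 8);
    return m.moved == Piece::King && m.white_moved == white &&
           (file_delta == 2 || file_delta == -2);
}
```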

Open question: these are move-of-the-ply predicates, distinct from position-state predicates. Different evaluation pattern. Same PositionPred type or a separate one?

Cover: definition of opening / middlegame / endgame for use with InPhase(P, phase).

Phase is one of: opening, middlegame, endgame. The classifier runs on each ply and returns the current phase.

Open question: definition choice. Candidates:

  • Move-number-based: opening = plies 1-20, middlegame = 21-60, endgame = 61+. Simple, deterministic, ignores actual position.
  • Material-based: endgame = total minor+major piece score ≤ N for both sides. Position-aware. Common in chess engines.
  • Move-and-material hybrid: opening = plies 1-20 OR queens still on home squares; endgame = material thresholds; else middlegame.

Pick one for V1; document the choice.
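
For comparison, the move-number and material candidates might be sketched as below. The ply boundaries come from the first candidate; everything in the material test is an assumption, including the conventional 3/3/5/9 piece values, the reading of "for both sides" as "each side separately", and the threshold parameter `n`.

```cpp
#include <bitset>
#include <cstdint>

using Bitboard = std::uint64_t;

enum class Phase { Opening, Middlegame, Endgame };

// Hypothetical per-side bitboards for the non-pawn, non-king pieces.
struct SideBB { Bitboard N = 0, B = 0, R = 0, Q = 0; };

inline int popcount64(Bitboard b) {
    return static_cast<int>(std::bitset<64>(b).count());
}

// Move-number-based: opening = plies 1-20, middlegame = 21-60, endgame = 61+.
inline Phase phase_by_ply(int ply) {
    if (ply <= 20) return Phase::Opening;
    if (ply <= 60) return Phase::Middlegame;
    return Phase::Endgame;
}

// Minor + major piece score, using 3/3/5/9 values (an assumption).
inline int piece_score(const SideBB& s) {
    return 3 * popcount64(s.N) + 3 * popcount64(s.B) +
           5 * popcount64(s.R) + 9 * popcount64(s.Q);
}

// Material-based endgame test: each side's score is at most n (assumed reading).
inline bool endgame_by_material(const SideBB& white, const SideBB& black, int n) {
    return piece_score(white) <= n && piece_score(black) <= n;
}
```

The ply-based classifier needs no board access at all, while the material test costs four popcounts per side; that difference may matter if the classifier runs on every ply.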

Cover: the syntax of material_count(expr). Generalizes v1’s --count "R=r" / "QBNqbn=0" / "P<p".

Expressions use single-letter piece codes (uppercase = white, lowercase = black) with comparison operators:

  • R=r — equal rook counts
  • P>p — white has more pawns
  • QBNqbn=0 — no queens, bishops, or knights on either side (rooks are not constrained)
  • B>=2 — white has at least 2 bishops
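
A small evaluator makes the intended semantics concrete. This sketch assumes that a run of piece letters means the sum of their counts (consistent with `QBNqbn=0`), that the right-hand side is either another run of letters or a non-negative integer, and that the operators are `=`, `<`, `>`, `<=`, `>=`. The `Counts` map and the function names are hypothetical; v1's actual `--count` grammar may differ.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>

// Piece letter -> number of such pieces on the board (missing letter = 0).
using Counts = std::map<char, int>;

// A term is either a non-negative integer or a run of piece letters whose
// counts are summed (assumed semantics).
inline int eval_term(const std::string& term, const Counts& counts) {
    if (term.empty()) throw std::invalid_argument("empty term");
    const bool numeric = std::isdigit(static_cast<unsigned char>(term[0])) != 0;
    int total = 0;
    for (char c : term) {
        if (numeric) {
            if (!std::isdigit(static_cast<unsigned char>(c)))
                throw std::invalid_argument("bad number");
            total = total * 10 + (c - '0');
        } else {
            if (std::string("KQRBNPkqrbnp").find(c) == std::string::npos)
                throw std::invalid_argument("bad piece code");
            const auto it = counts.find(c);
            if (it != counts.end()) total += it->second;
        }
    }
    return total;
}

// Evaluate "<term><op><term>", e.g. "R=r", "P>p", "QBNqbn=0", "B>=2".
inline bool material_count(const std::string& expr, const Counts& counts) {
    const std::size_t pos = expr.find_first_of("<>=");
    if (pos == std::string::npos) throw std::invalid_argument("missing operator");
    std::string op(1, expr[pos]);
    std::size_t rhs_start = pos + 1;
    if (op != "=" && rhs_start < expr.size() && expr[rhs_start] == '=') {
        op += '=';
        ++rhs_start;
    }
    const int lhs = eval_term(expr.substr(0, pos), counts);
    const int rhs = eval_term(expr.substr(rhs_start), counts);
    if (op == "=")  return lhs == rhs;
    if (op == "<")  return lhs < rhs;
    if (op == ">")  return lhs > rhs;
    if (op == "<=") return lhs <= rhs;
    return lhs >= rhs;  // op == ">="
}
```

In a real implementation the counts would come from popcounts of the piece bitboards, and the expression would be parsed once per query rather than once per position.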

Open question: is this its own mini-language, or do we lift it into the predicate library directly via material_count("R=r") being a PositionPred? Either works; the former is more compact.

Cover: process for adding a new predicate to the library.

  • Name and signature
  • Bitboard expression (must be branch-free and SIMD-friendly where possible)
  • Cost estimation (target: ≤20 cycles per evaluation for “cheap” position predicates; tactical / engine-evaluated can cost more)
  • Selectivity hint (rough fraction matching Ever(P) on Lichess corpus)
  • Test case (one query that exercises it end-to-end)
  • Doc entry following the spec format above