Transport (spec)
Status: draft v0.2 (2026-05-17; op-log persistence + log replay + session-seed protocol)
Prerequisite reading: op vocabulary v0.6, identity and trust v0.2
Subsequent work: transport implementation refresh; multi-tab dev harness.
What changed in v0.2
Three concrete additions, all motivated by the identity v0.2 trust model (specifically: enabling anonymous-with-link sessions where the owner may be offline when joiners arrive):
- Relay holds per-session op log. Previously the relay was a pure forwarder with no state beyond connected peers. v0.2 adds an in-memory op log per session, so late joiners can sync without requiring an online peer who happens to hold the history. This is a meaningful expansion of the relay’s responsibility (it now holds data, not just connections), but a narrow one — the relay still doesn’t verify signatures, sign anything, or interpret op content.
- Session-seed protocol. The first peer to claim a fresh session ID seeds the session’s metadata (rootKeys, startFen) via an extended HELLO message. Subsequent peers receive the seeded metadata in their WELCOME. This eliminates the v0.1 assumption that the relay’s `sessionMetaResolver` had pre-knowledge of every session.
- LOG_REPLAY message type. A new joiner whose log is empty (or significantly behind) requests a replay from the relay. The relay streams all stored ops back, ordered by arrival. The joiner applies them through the standard pipeline (signature + chain verification on every op). This replaces the v0.1 “peer-to-peer snapshot exchange” for the offline-host scenario.
What’s unchanged: the Transport interface, both topologies (star and mesh), the reliable-ordered-delivery requirement, JSON canonical wire format, membership awareness, watermark gossip, op-set summary gossip.
The pull-only snapshot exchange from v0.1 (§4.5) is preserved but demoted: it’s still useful for online-host scenarios (snapshot is more compact than full log replay) and remains in §4.5. For offline-host scenarios, LOG_REPLAY (§4.10) is the primary mechanism.
Background
The op-vocabulary and identity specs define what gets sent on the wire (signed ops, trust state) and how each peer validates what it receives (verify signature, check trust at op-HLC, validate op type). They’re silent on how the wire works: how peers find each other, how messages get delivered, how state syncs, how membership is tracked, how late joiners catch up. That’s this spec.
Reminder of scope: the wire carries OpenFile ops, snapshots, and signaling messages — runtime machinery for a live session. The persistent artifact a session produces is a PGN (see README). The relay’s op log buffer is bounded; once a session’s effects are captured in a snapshot, ops below the watermark are GC’d. Wire formats here are operational, not archival.
OpenFile supports two distinct deployment topologies:
- Star (relay-based). A central server forwards messages between peers. Production deployments, large sessions, broadcast scenarios.
- Mesh (P2P). Peers connect directly via WebRTC after a brief signaling handshake. Hostless small-group sessions, “share-a-link-no-server” deployments.
Apps choose. Both topologies use the same wire protocol, the same ops, the same identity model, the same persistence formats. The Transport interface abstracts the difference. The op layer doesn’t know or care which is underneath.
The protocol is deliberately small. WebRTC and WebSocket already solve framing, ordering, reliability, and channel security. This spec specifies what we layer on top of them: the application protocol (op envelopes, membership events, snapshot exchanges, watermark gossip, rejection notifications).
§1 Foundational decisions
1.1 Two topologies, one Transport interface
OpenFile supports both star and mesh. They’re real product modes with distinct use cases (see Background); we don’t choose one and defer the other.
The mechanism is a Transport interface that the op layer consumes,
implemented by two reference impls (WebSocket for star, WebRTC for
mesh). Apps instantiate the right transport for their deployment;
the rest of the stack is identical.

    const target = createOpenFileTarget({
      transport: createWebSocketTransport({ url: 'wss://relay.example.com/...' }),
      // OR:
      // transport: createWebRTCTransport({ signalingUrl: 'wss://signaling.example.com/...' }),
      // ... other options (rootKeys, signer, verifier, etc.) are identical
    });

The interface (§3) is small: send / receive ops, observe membership, exchange snapshots, manage connection lifecycle. Both impls satisfy the same contract; custom impls (in-process bus for tests, IPC between Electron windows, anything else) plug in identically.
1.2 Reliable ordered delivery required
The Transport interface mandates reliable, ordered, eventually-delivered semantics for ops. Both reference impls provide this:
- WebSocket — TCP-backed; reliable + ordered by default.
- WebRTC data channel — configured with `{ ordered: true, maxRetransmits: undefined }` (mimics TCP).
Why mandate at the interface: the op-layer’s apply pipeline assumes ops arrive in some sane order. Out-of-order delivery is buffered gracefully via causal-validity rules (op-vocab §4.3), but reliable delivery is required — a permanently-lost op breaks convergence. Both transports underneath are reliable; the interface enforces it explicitly so custom transports can’t surprise the op layer.
(Note: HLC ordering of ops is independent of transport ordering. Ops can arrive out of HLC order; the buffer + apply machinery handles that. What we require is that each op eventually arrives.)
1.3 JSON canonical wire format
All wire messages use canonical JSON per RFC 8785 (the same encoding mandated by the identity spec for signing). Implications:
- Op envelopes on the wire are byte-identical to op envelopes at rest (persistence spec §10.6). One canonicalization, three contexts.
- Wire framing is line-delimited JSON (for WebSocket text frames) or length-prefixed JSON (for WebRTC binary mode); both are implementation choices that don’t affect content.
- Non-op messages (membership events, snapshot exchanges) are also canonical JSON, with a `type` discriminator field.
v1 ships JSON only. Binary canonical form (CBOR or similar) is reserved for v2 if profiling demands it — see op-vocab §12 #1.
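As an illustration of the canonicalization step, here is a minimal sketch in the spirit of RFC 8785. The names `canonicalize` and `Json` are assumptions made for this example and are not part of the OpenFile API. The sketch assumes plain JSON inputs (no `undefined`, `NaN`, or lone surrogates); a production implementation would more likely use a vetted JCS library.

```typescript
// Simplified canonicalizer in the spirit of RFC 8785 (JCS).
// Assumption: inputs are plain JSON values only.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

function canonicalize(value: Json): string {
  if (value === null || typeof value !== 'object') {
    // Primitives: JSON.stringify uses ECMAScript number and string
    // serialization, which is what JCS builds on.
    return JSON.stringify(value);
  }
  if (Array.isArray(value)) {
    // Arrays keep their order; only object members are sorted.
    return '[' + value.map((item) => canonicalize(item)).join(',') + ']';
  }
  // The default sort compares UTF-16 code units, which matches the
  // property ordering JCS requires.
  const members = Object.keys(value)
    .sort()
    .map((key) => JSON.stringify(key) + ':' + canonicalize(value[key]));
  return '{' + members.join(',') + '}';
}

// Two objects with the same content but different key order
// serialize to the same bytes.
const a = canonicalize({ type: 'op', messageId: 'm1', op: { seq: 7, author: 'alice' } });
const b = canonicalize({ op: { author: 'alice', seq: 7 }, messageId: 'm1', type: 'op' });
console.log(a === b);
```

Because the output depends only on content, the same string can be signed, stored, and sent without re-encoding.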
1.4 Strong membership awareness
The Transport interface exposes strong membership — every peer knows the current set of connected peers in the session. Required because:
- Mesh requires it. A peer broadcasting an op in mesh topology must send to every other peer individually; without a peer list, there’s nowhere to send.
- Star benefits from it. The relay already tracks who’s connected; exposing this to peers enables UI affordances (“Alice and Bob are here”).
- Future-proofs presence. A v2+ presence feature (peer cursors, “Alice is looking at move 47”) layers naturally on top of membership events.
Cost is small: the Transport emits peer-join / peer-leave events; peers maintain a Set locally.
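A minimal sketch of that local bookkeeping follows. The `Membership` class is a hypothetical helper, not part of the spec; it uses a `Map` keyed by public key rather than a bare `Set` so that the full `PeerInfo` stays available for UI use.

```typescript
// PeerInfo mirrors the shape exposed by the Transport interface (§3).
type PeerInfo = { publicKey: string; transportId: string; joinedAt: number };

// Hypothetical helper: keeps the local view of who is connected.
class Membership {
  private readonly peers = new Map<string, PeerInfo>();

  join(peer: PeerInfo): void {
    // A re-join (for example after a reconnect) overwrites the old entry.
    this.peers.set(peer.publicKey, peer);
  }

  leave(peerPublicKey: string): boolean {
    // Returns false when the peer was unknown, so duplicate
    // peer-leave events can be ignored by the caller.
    return this.peers.delete(peerPublicKey);
  }

  list(): PeerInfo[] {
    return Array.from(this.peers.values());
  }
}

const members = new Membership();
members.join({ publicKey: 'alice', transportId: 't1', joinedAt: 1 });
members.join({ publicKey: 'bob', transportId: 't2', joinedAt: 2 });
members.leave('alice');
console.log(members.list().map((p) => p.publicKey));
```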
1.5 Pull-only snapshot exchange
When a new peer joins a long-running session, they need to catch up. Two models considered:
- Push — existing peers (or the relay) proactively send snapshots to joiners.
- Pull — joiners request a snapshot from a chosen peer at a chosen HLC.
v1 commits to pull. Reasons:
- Topology-agnostic. Star and mesh both support it identically: the joiner picks a peer (relay or any other peer) and requests.
- Simpler error handling. If a snapshot request fails, the joiner retries against a different peer. Push would require the sender to know whether the receiver got it.
- Bandwidth-controlled. Joiner decides when to fetch (e.g., after initial connection is healthy); push would require the sender to manage outbound timing.
Push is a v2 optimization if profiling shows pull is too slow for many simultaneous joiners. Until then, pull.
§2 Layer boundaries
2.1 What this spec owns
- Transport interface — the contract that the op layer consumes.
- Wire protocol — message types, envelope structure, ordering rules between message types.
- Two reference Transport impls — WebSocket and WebRTC.
- Reference signaling server — minimal NAT-traversal endpoint for WebRTC topology.
- Membership protocol — peer-join / peer-leave events, peer presentation of session secrets.
- Session-seed protocol (v0.2) — first peer to claim a sessionId seeds metadata via HELLO; relay stores it for future joiners.
- Per-session op log at the relay (v0.2) — stored bytes, not interpreted content. Enables log replay without a peer being online to provide a snapshot.
- Log replay protocol (v0.2) — bounded streaming of stored ops to joiners.
- Snapshot exchange protocol — pull request / response between peers (preserved from v0.1; still useful for online-host cases).
- Causal stability protocol — watermark gossip; coordinates with op-layer’s GC machinery.
- Gossip protocol — periodic op-set summaries for censorship-by-withholding detection.
- Rejection notifications — when receivers reject ops, the emitter learns (with appropriate information-leak care).
- Sync mode enforcement — at the transport layer, ensure `spectator/follow-leader` peers don’t broadcast local ops.
2.2 What the op layer + identity layer own
(For clarity: these belong to other specs, not this one.)
- Op vocabulary, validation, apply rules — op-vocab spec.
- HLC clock, ordering, tie-breaks — op-vocab spec.
- Op signing and verification — identity spec.
- Trust state derivation — identity spec.
- Persistence formats — op-vocab spec §10.
This spec consumes all of the above. We assume each peer’s op layer + identity layer are working correctly; the transport’s job is to get bytes from one peer’s emit to other peers’ receive.
2.3 What apps own
- Deployment topology choice — pick the transport at construction.
- Relay server operation (for star deployments) — hosting, scaling, persistence.
- Signaling server operation (for mesh deployments) — hosting the minimal NAT-traversal endpoint, or reusing a public one.
- Session metadata distribution — root keys, secrets, URLs. The link / invite payload that brings peers together.
- Display name resolution — same as identity spec.
- UI affordances — connection-status indicators, “X is offline,” reconnect buttons.
- Persistence wiring — where the bytes go, per op-vocab §10.
2.4 What’s deferred to v2+
- Bandwidth optimization — delta encoding, op batching beyond natural framing. v2 if profiling demands.
- Multi-relay federation — star deployments with multiple relays that gossip among themselves. Today: one relay per session.
- Selective sync — “send me only ops affecting this subtree.” Not a chess-collab need; ops are small enough to ship everything.
- NAT-traversal fallback to TURN servers — STUN-only signaling works for most home networks; corporate networks behind strict NATs may need TURN relay. App / signaling-server policy.
- Push-based snapshot distribution — see §1.5.
- Encrypted application-level payloads — ops are sent over already-encrypted TLS / DTLS channels. Encrypting inside the channel (so the relay can’t read content) is a v2+ privacy concern.
§3 The Transport interface
The contract the op layer consumes:

    type TransportState =
      | 'idle' | 'connecting' | 'connected' | 'reconnecting' | 'closed';

    interface Transport {
      // ── Lifecycle ───────────────────────────────────────────────────

      /** Establish the underlying connection (WebSocket, WebRTC, etc.).
       *  Resolves when ready to send/receive. */
      connect(): Promise<void>;

      /** Tear down the connection. Idempotent. */
      disconnect(): void;

      /** Current connection state. */
      state: TransportState;

      /** Subscribe to state transitions. */
      onState(handler: (state: TransportState) => void): () => void;

      // ── Op exchange ─────────────────────────────────────────────────

      /** Broadcast a signed op to all connected peers (subject to
       *  sync-mode policy; see §10). */
      send(op: SignedOp): void;

      /** Subscribe to ops arriving from peers. */
      onOp(handler: (op: SignedOp) => void): () => void;

      // ── Membership ──────────────────────────────────────────────────

      /** Currently-connected peers (excluding self). */
      getPeers(): PeerInfo[];

      /** Subscribe to peer-join events. */
      onPeerJoin(handler: (peer: PeerInfo) => void): () => void;

      /** Subscribe to peer-leave events. */
      onPeerLeave(handler: (peerPublicKey: string) => void): () => void;

      // ── Snapshot exchange ───────────────────────────────────────────

      /** Request a snapshot from a peer at a given HLC. Resolves with
       *  the snapshot bytes; rejects on timeout, refused, peer-gone. */
      requestSnapshot(
        fromPeer: string,
        opts?: { atHlc?: number; timeoutMs?: number },
      ): Promise<Uint8Array>;

      /** Subscribe to incoming snapshot requests. The handler must
       *  produce snapshot bytes (typically by calling target.exportSnapshot)
       *  or reject. */
      onSnapshotRequest(
        handler: (req: SnapshotRequest) => Promise<Uint8Array>,
      ): () => void;

      // ── Watermark gossip ────────────────────────────────────────────

      /** Broadcast this peer's current local-applied watermark to peers.
       *  Other peers' transports collect these; the op layer computes
       *  the session-wide minimum and advances its GC watermark
       *  accordingly. */
      sendWatermark(localHlc: number): void;

      /** Subscribe to incoming watermark gossip. */
      onWatermark(
        handler: (peer: string, theirHlc: number) => void,
      ): () => void;

      // ── Op-set gossip (censorship defense) ──────────────────────────

      /** Periodically (typically every N ops or T seconds), exchange
       *  compact op-set summaries with peers. Discrepancies surface
       *  when a peer is missing ops they should have. See §11.
       *  Optional: transports without gossip return a no-op
       *  implementation; identity spec's threat-model already notes
       *  censorship-by-withholding as an out-of-protocol concern. */
      sendOpSetSummary?(summary: OpSetSummary): void;
      onOpSetSummary?(
        handler: (peer: string, summary: OpSetSummary) => void,
      ): () => void;

      // ── Rejection notifications ─────────────────────────────────────

      /** When this peer receives an op and rejects it (per identity
       *  spec §4.3), optionally notify the emitting peer so their UX
       *  can surface "your op was rejected." */
      sendRejection?(toPeer: string, rejection: RejectionInfo): void;
      onRejection?(handler: (rej: RejectionInfo) => void): () => void;
    }

    type PeerInfo = {
      publicKey: string;    // identity-layer public key
      transportId: string;  // transport-level peer ID (separate)
      joinedAt: number;     // local timestamp of join
    };

    type SnapshotRequest = {
      fromPeer: string;
      atHlc?: number;       // requested snapshot HLC; undefined = current
      requestId: string;    // for response correlation
    };

    type OpSetSummary = {
      // Implementation choice: Bloom filter, Merkle tree root, sorted
      // opId list per author, etc. Both peers must implement the same
      // summary format to compare. Reference impls use per-author
      // (maxSeq, hash-of-applied-opIds-up-to-maxSeq). See §11.
      perAuthor: { [author: string]: { maxSeq: number; hash: string } };
    };

    type RejectionInfo = {
      opId: { author: string; seq: number };
      reason: 'invalid-signature' | 'untrusted-author' | 'invalid-op' | 'illegal-move';
      // Note: 'invalid-signature' is information-sensitive; see §13.
    };

That’s the whole interface. Custom transports (test mocks, in-process buses, anything else) implement exactly this contract; the op layer behaves identically regardless of what’s underneath.
§4 Wire protocol
Messages exchanged between peers (or via a relay) over the underlying transport. All messages are canonical JSON with a `type` discriminator.
4.1 Message types
| Type | Direction | Purpose |
|---|---|---|
| `hello` | Peer → relay/peer | Initial handshake on connect |
| `welcome` | Relay/peer → joiner | Handshake response with session metadata |
| `op` | Peer → all peers | Broadcast a signed op |
| `peer-join` | Relay/signal → peers | New peer joined the session |
| `peer-leave` | Relay/signal → peers | Peer disconnected |
| `snapshot-request` | Peer → peer | Ask for state snapshot at HLC |
| `snapshot-response` | Peer → peer | Snapshot bytes (or refusal) |
| `log-replay-request` (v0.2) | Peer → relay | Request stored op log replay |
| `log-replay-chunk` (v0.2) | Relay → peer | Streamed op (one per chunk) |
| `log-replay-end` (v0.2) | Relay → peer | End-of-stream marker |
| `watermark` | Peer → all peers | Causal-stability gossip |
| `op-set-summary` | Peer → all peers | Op-set summary for gossip protocol |
| `rejection` | Peer → emitter | “I rejected your op N for reason X” |
Each message carries type and a messageId (UUID) for response
correlation. Some types carry additional fields per their semantics.
4.2 Op message

    {
      "type": "op",
      "messageId": "uuid",
      "op": { /* signed op envelope per identity spec §3.1 */ }
    }

The `op` field is the full signed op: the same bytes used at rest (persistence spec §10) and the same canonicalized payload used for signing.
Broadcast semantics:
- In star topology, peers send `op` to the relay; the relay forwards to all other connected peers.
- In mesh topology, peers send `op` directly to each connected peer (or via a chosen routing strategy; see §6.3).
The relay (in star) does NOT verify signatures or check trust. It’s a dumb pipe. Verification happens at each endpoint per identity spec §4.2. This keeps the relay simple and means a compromised relay can drop or reorder ops but can’t forge them.
4.3 Handshake (hello / welcome) — extended in v0.2
When a peer connects, it sends `hello`. v0.2 extends the v0.1 form with an optional `seedSessionMeta` for the first-peer-to-claim-a-session-id case:

    {
      "type": "hello",
      "messageId": "uuid",
      "publicKey": "MCowBQ...",          // joiner's identity public key
      "sessionId": "session-abc-123",    // which session to join
      "sessionSecret": "...",            // for link-based join (optional)
      "seedSessionMeta": {               // v0.2, optional; populates a
        "rootKeys": ["MCowBQ..."],       // fresh session if relay has
        "startFen": "rnbqkbnr/..."       // no record of this sessionId yet
      },
      "protocolVersion": 1
    }

Session-seed protocol (v0.2). When the relay receives a HELLO for a sessionId it has no record of:
- If `seedSessionMeta` is present, the relay stores it as the session’s metadata for future joiners. The seeding peer becomes the bootstrapping owner.
- If `seedSessionMeta` is absent, the relay responds with `welcome` carrying `sessionMeta: null`. The joining peer can either:
  - Wait for someone else to seed (uncommon; usually means misconfiguration), OR
  - Disconnect with a session-not-found error.

When the relay receives a HELLO for a sessionId it already knows:
- `seedSessionMeta`, if present in the new HELLO, is ignored. The seeded metadata is immutable for the session’s lifetime.
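The seeding rules above can be sketched as a small relay-side helper. `resolveSessionMeta` and the `Hello` / `SessionMeta` type names are hypothetical and used only for illustration; the sketch covers only the three cases listed and ignores `sessionSecret` and version checks.

```typescript
type SessionMeta = { rootKeys: string[]; startFen: string };
type Hello = { sessionId: string; publicKey: string; seedSessionMeta?: SessionMeta };

// Returns the sessionMeta to place in the welcome, or null when the
// session is unknown and the hello carried no seed.
function resolveSessionMeta(
  sessions: Map<string, SessionMeta>,
  hello: Hello,
): SessionMeta | null {
  const existing = sessions.get(hello.sessionId);
  if (existing !== undefined) {
    // Known session: stored metadata is authoritative, any seed is ignored.
    return existing;
  }
  if (hello.seedSessionMeta !== undefined) {
    // Fresh session: store a copy so later mutation by the caller
    // cannot change the seeded metadata.
    const seeded: SessionMeta = {
      rootKeys: hello.seedSessionMeta.rootKeys.slice(),
      startFen: hello.seedSessionMeta.startFen,
    };
    sessions.set(hello.sessionId, seeded);
    return seeded;
  }
  // Unknown session, no seed: welcome carries sessionMeta: null.
  return null;
}

const sessions = new Map<string, SessionMeta>();
const first = resolveSessionMeta(sessions, {
  sessionId: 's1',
  publicKey: 'owner',
  seedSessionMeta: { rootKeys: ['owner'], startFen: 'startpos' },
});
const second = resolveSessionMeta(sessions, {
  sessionId: 's1',
  publicKey: 'mallory',
  seedSessionMeta: { rootKeys: ['mallory'], startFen: 'other' },
});
const unknown = resolveSessionMeta(sessions, { sessionId: 's2', publicKey: 'bob' });
console.log(first, second, unknown);
```

The second call shows why immutability matters: a later peer cannot replace the root keys by re-seeding an existing session.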
Welcome response:

    {
      "type": "welcome",
      "messageId": "uuid",
      "inReplyTo": "uuid-of-hello",
      "sessionMeta": {
        "rootKeys": ["MCowBQ...root1"],
        "startFen": "rnbqkbnr/..."
      },
      "currentPeers": [
        { "publicKey": "...", "transportId": "...", "joinedAt": 12345 }
      ],
      "watermark": 1730412345000123,
      "logSize": 142,                    // v0.2: number of ops in the relay's log
      "protocolVersion": 1
    }

The joiner now has:
- Session metadata — needed to validate root delegations.
- Current peer list — needed for mesh broadcast routing.
- Current watermark — so the joiner knows what HLC range to expect.
- v0.2: log size — lets the joiner decide whether to request full log replay or a snapshot, based on cost.
After welcome, the joiner typically requests either a snapshot
(§4.5, online-host scenario) or a log replay (§4.10, offline-host
scenario, also fine for fresh joiners). The standard pattern for the
multi-tab / anonymous-with-link UX is log replay — it doesn’t require
any peer to be online.
4.4 Peer-join / peer-leave
The relay (star) or the signaling/discovery mechanism (mesh) emits these to existing peers when membership changes:

    {
      "type": "peer-join",
      "messageId": "uuid",
      "peer": { "publicKey": "MCowBQ...", "transportId": "...", "joinedAt": 12345 }
    }

    {
      "type": "peer-leave",
      "messageId": "uuid",
      "peerPublicKey": "MCowBQ..."
    }

Peers update their local membership Sets on each event. In mesh, peer-join also typically triggers a new WebRTC connection negotiation between the new peer and the existing peer (see §6.3).
4.5 Snapshot-request / snapshot-response

    {
      "type": "snapshot-request",
      "messageId": "uuid",
      "atHlc": 1730412345000123,
      "requestedBy": "MCowBQ..."
    }

    {
      "type": "snapshot-response",
      "messageId": "uuid",
      "inReplyTo": "uuid-of-request",
      "snapshot": { /* per op-vocab §10.3 */ },
      "tail": [ /* signed ops with hlc > snapshot.generatedAt */ ]
    }

The respondent (in star, a peer the relay forwards the request to, per §5.2; in mesh, any peer) calls target.exportSnapshot() and optionally target.exportOpLog({ sinceHlc: snapshot.generatedAt }) to assemble the tail. The joiner hydrates via createOpenFileTarget({ snapshot, opLogTail }).
If a peer can’t fulfill a snapshot request (e.g., they don’t have state at the requested HLC, or they’re under load), they respond with:

    {
      "type": "snapshot-response",
      "messageId": "uuid",
      "inReplyTo": "uuid-of-request",
      "error": "no-snapshot-at-hlc" | "load-shedding" | ...
    }

The joiner retries against a different peer.
4.6 Watermark gossip

    {
      "type": "watermark",
      "messageId": "uuid",
      "fromPeer": "MCowBQ...",
      "hlc": 1730412345000123
    }

Periodically (every N ops or T seconds; implementation choice), each peer broadcasts its current applied-watermark to peers. Other peers track the per-peer values. The session-wide watermark for GC is min(observed peer watermarks): once that minimum advances, ops below it are GC-eligible (op-vocab §6.2).
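The minimum can be computed as in the sketch below. Two choices here are assumptions rather than rules from this spec: the local watermark is included in the minimum, and a connected peer that has not yet reported blocks GC (the function returns `null`). `sessionWatermark` is a hypothetical name.

```typescript
// Computes the session-wide GC watermark from gossip observations.
// Assumptions: the local watermark participates in the minimum, and a
// connected peer with no report yet means "do not advance GC".
function sessionWatermark(
  localHlc: number,
  observed: ReadonlyMap<string, number>,
  connectedPeers: readonly string[],
): number | null {
  let min = localHlc;
  for (const peer of connectedPeers) {
    const hlc = observed.get(peer);
    if (hlc === undefined) {
      return null; // no report from this peer yet
    }
    if (hlc < min) {
      min = hlc;
    }
  }
  return min;
}

const reports = new Map<string, number>([
  ['alice', 300],
  ['bob', 450],
]);
console.log(sessionWatermark(500, reports, ['alice', 'bob']));
console.log(sessionWatermark(500, reports, ['alice', 'bob', 'carol']));
```

In the first call the slowest peer (alice, at 300) bounds GC; in the second, carol has not reported, so the sketch declines to advance.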
4.7 Op-set summary

    {
      "type": "op-set-summary",
      "messageId": "uuid",
      "fromPeer": "MCowBQ...",
      "summary": {
        "perAuthor": {
          "MCowBQ...alice": { "maxSeq": 47, "hash": "..." },
          "MCowBQ...bob":   { "maxSeq": 92, "hash": "..." }
        }
      }
    }

Periodic exchange (less frequent than watermarks; every minute, say). Discrepancies between peer summaries surface missing ops; see §11 for the resolution protocol.
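One way a peer might compare a received summary with its own is sketched below. The `compareSummaries` function and the `Discrepancy` kinds are hypothetical; the actual resolution protocol belongs to §11. The sketch assumes hashes are only comparable when both sides report the same `maxSeq` for an author.

```typescript
type SummaryEntry = { maxSeq: number; hash: string };
type OpSetSummary = { perAuthor: { [author: string]: SummaryEntry } };
type Discrepancy = {
  author: string;
  kind: 'local-behind' | 'remote-behind' | 'diverged';
};

function compareSummaries(local: OpSetSummary, remote: OpSetSummary): Discrepancy[] {
  // Union of authors from both sides, sorted for deterministic output.
  const authors = Array.from(
    new Set([...Object.keys(local.perAuthor), ...Object.keys(remote.perAuthor)]),
  ).sort();

  const out: Discrepancy[] = [];
  for (const author of authors) {
    const mine: SummaryEntry | undefined = local.perAuthor[author];
    const theirs: SummaryEntry | undefined = remote.perAuthor[author];
    if (mine === undefined || (theirs !== undefined && mine.maxSeq < theirs.maxSeq)) {
      out.push({ author, kind: 'local-behind' }); // we are missing ops
    } else if (theirs === undefined || theirs.maxSeq < mine.maxSeq) {
      out.push({ author, kind: 'remote-behind' }); // they are missing ops
    } else if (mine.hash !== theirs.hash) {
      out.push({ author, kind: 'diverged' }); // same maxSeq, different op set
    }
  }
  return out;
}

const localSummary: OpSetSummary = {
  perAuthor: { alice: { maxSeq: 47, hash: 'h1' }, bob: { maxSeq: 90, hash: 'h2' } },
};
const remoteSummary: OpSetSummary = {
  perAuthor: {
    alice: { maxSeq: 47, hash: 'h1' },
    bob: { maxSeq: 92, hash: 'h3' },
    carol: { maxSeq: 3, hash: 'h4' },
  },
};
console.log(compareSummaries(localSummary, remoteSummary));
```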
4.8 Rejection notification

    {
      "type": "rejection",
      "messageId": "uuid",
      "toPeer": "MCowBQ...",
      "opId": { "author": "...", "seq": 7 },
      "reason": "untrusted-author"
    }

Sent to the emitting peer when their op was rejected. See §13 for which reasons are safe to share and which aren’t.
4.9 Wire format versioning
The `protocolVersion` field in `hello` / `welcome` declares the protocol version. v1 = 1. If peers’ versions don’t match:
- Relay (or first-contact peer) responds with `welcome` carrying `error: "version-mismatch"`. Joiner displays “this app version isn’t compatible with the session.”
- Future versions may support backward compatibility via downgrade negotiation. v1 just fails cleanly.
Schema changes within v1 are additive only (new optional fields, new message types). Breaking changes bump to v2.
4.10 Log replay (v0.2)
A new joiner whose state is empty (or far behind) requests stored ops from the relay. Unlike snapshot exchange (§4.5), log replay doesn’t depend on any peer being online: the relay serves from its own buffer.
Request:

    {
      "type": "log-replay-request",
      "messageId": "uuid",
      "fromHlc": null,                   // null = from beginning; or an HLC for incremental
      "requestedBy": "MCowBQ..."
    }

Response stream. The relay sends one `log-replay-chunk` per op, in arrival order (which is approximately HLC order, but the receiver’s causal buffer handles out-of-order anyway):

    {
      "type": "log-replay-chunk",
      "messageId": "uuid",
      "inReplyTo": "uuid-of-request",
      "seqInReplay": 0,                  // 0-indexed; lets receiver detect drops
      "op": { /* signed op */ }
    }

Followed by:

    {
      "type": "log-replay-end",
      "messageId": "uuid",
      "inReplyTo": "uuid-of-request",
      "totalSent": 142,
      "watermark": 1730412345000123      // relay's current watermark
    }

Applying the replay. The joiner applies each op through the standard pipeline: signature verification, chain walk, capability check, chess-data validation. Same code path as receiving live ops. The causal buffer handles out-of-order arrival.
Live ops during replay. Ops broadcast to the session while the
replay is in flight are also delivered to the joiner via the normal
op channel. The joiner’s apply pipeline deduplicates by opId, so
overlap with the replay is harmless.
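The receiver side of a replay, including drop detection via `seqInReplay` and dedup by opId, might look like the sketch below. `ReplayReceiver` is a hypothetical class, and the `SignedOp` shape shown (flat `author`, `seq`, `hlc` fields) is an assumption for illustration; the real envelope is defined by the identity spec. In the real pipeline, dedup happens inside the apply pipeline after verification.

```typescript
// Hypothetical op shape: only the fields this sketch needs.
type SignedOp = { author: string; seq: number; hlc: number };
type ReplayChunk = { inReplyTo: string; seqInReplay: number; op: SignedOp };

class ReplayReceiver {
  private nextSeqInReplay = 0;
  private readonly seenOpIds = new Set<string>();
  private readonly requestId: string;
  private readonly apply: (op: SignedOp) => void;

  constructor(requestId: string, apply: (op: SignedOp) => void) {
    this.requestId = requestId;
    this.apply = apply;
  }

  // Live ops and replayed ops share one dedup set keyed by opId.
  private applyOnce(op: SignedOp): void {
    const opId = op.author + ':' + op.seq;
    if (this.seenOpIds.has(opId)) return;
    this.seenOpIds.add(opId);
    this.apply(op);
  }

  onLiveOp(op: SignedOp): void {
    this.applyOnce(op);
  }

  onChunk(chunk: ReplayChunk): void {
    if (chunk.inReplyTo !== this.requestId) return; // not our replay
    if (chunk.seqInReplay !== this.nextSeqInReplay) {
      throw new Error(
        'replay gap: expected ' + this.nextSeqInReplay + ', got ' + chunk.seqInReplay,
      );
    }
    this.nextSeqInReplay += 1;
    this.applyOnce(chunk.op);
  }

  // True when the chunk count matches log-replay-end.totalSent.
  onEnd(totalSent: number): boolean {
    return this.nextSeqInReplay === totalSent;
  }
}

const applied: string[] = [];
const receiver = new ReplayReceiver('req-1', (op) => {
  applied.push(op.author + ':' + op.seq);
});
// A live op arrives before the replay reaches it.
receiver.onLiveOp({ author: 'alice', seq: 2, hlc: 20 });
receiver.onChunk({ inReplyTo: 'req-1', seqInReplay: 0, op: { author: 'alice', seq: 1, hlc: 10 } });
receiver.onChunk({ inReplyTo: 'req-1', seqInReplay: 1, op: { author: 'alice', seq: 2, hlc: 20 } });
const complete = receiver.onEnd(2);
console.log(applied, complete);
```

The op `alice:2` is applied once even though it arrives both live and in the replay, and out-of-HLC-order arrival is left to the causal buffer.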
Replay vs snapshot — when to use which.
- Replay is the default for the anonymous-with-link / multi-tab case. Doesn’t require an online peer. Cost: O(ops) bytes; for long-lived sessions this can be large.
- Snapshot (§4.5) is the optimization for online-host scenarios where another peer can serve a compact snapshot of derived state. Snapshots compress repeated mutations to a single per-register entry. Cost: smaller, but requires an online provider.
Apps can choose. The reference WebSocket client (transport implementation) defaults to: try snapshot first if other peers exist, fall back to log replay if no peers or snapshot request times out.
Incremental replay. The fromHlc field supports resume — a peer
that disconnects and reconnects can request only ops with hlc > lastSeen. The relay filters its log accordingly.
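A relay-side sketch of that filter follows. It assumes the relay records each op’s HLC alongside the stored op when appending to the log (the `StoredOp` shape is hypothetical), and it takes an injected `newMessageId` generator so the example stays deterministic; a real relay would generate UUIDs.

```typescript
// Assumption: the relay keeps the op's HLC next to the stored op so it
// can filter without interpreting the rest of the envelope.
type StoredOp = { hlc: number; op: unknown };

type ReplayMessage =
  | { type: 'log-replay-chunk'; messageId: string; inReplyTo: string; seqInReplay: number; op: unknown }
  | { type: 'log-replay-end'; messageId: string; inReplyTo: string; totalSent: number; watermark: number };

function buildReplay(
  log: readonly StoredOp[],
  request: { messageId: string; fromHlc: number | null },
  watermark: number,
  newMessageId: () => string,
): ReplayMessage[] {
  const fromHlc = request.fromHlc;
  const messages: ReplayMessage[] = [];
  let seqInReplay = 0;
  for (const entry of log) {
    // Incremental replay: skip ops at or below the requested HLC.
    if (fromHlc !== null && entry.hlc <= fromHlc) continue;
    messages.push({
      type: 'log-replay-chunk',
      messageId: newMessageId(),
      inReplyTo: request.messageId,
      seqInReplay,
      op: entry.op,
    });
    seqInReplay += 1;
  }
  messages.push({
    type: 'log-replay-end',
    messageId: newMessageId(),
    inReplyTo: request.messageId,
    totalSent: seqInReplay,
    watermark,
  });
  return messages;
}

let counter = 0;
const nextId = () => 'msg-' + (counter += 1);
const relayLog: StoredOp[] = [
  { hlc: 10, op: 'a' },
  { hlc: 20, op: 'b' },
  { hlc: 30, op: 'c' },
];
const resumed = buildReplay(relayLog, { messageId: 'req-1', fromHlc: 10 }, 30, nextId);
console.log(resumed.map((m) => m.type));
```

Log entries are emitted in arrival order, and `seqInReplay` counts only the ops actually sent, so it always starts at 0 for each request.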
Log GC at the relay. The relay’s op log can be pruned for ops below the session’s watermark — those ops are causally settled and can’t be needed by any future joiner (snapshot would carry them in derived form). Pruning policy is implementation-defined; reference implementation keeps everything in v0.2 and adds policy in v0.3.
§5 WebSocket Transport (reference impl for star)
The reference implementation for star topology. Peers connect to a WebSocket relay server; the relay forwards messages between peers.
5.1 Client behavior

    const transport = createWebSocketTransport({
      url: 'wss://relay.example.com/session/abc-123',
      onError: (err) => { /* app-specific error handling */ },
      reconnect: { enabled: true, maxAttempts: 10, backoffMs: 1000 },
    });

The client:
- Opens a WebSocket to the URL.
- Sends `hello` immediately on open; awaits `welcome`.
- Receives all subsequent messages and dispatches per type:
  - `op` → `onOp` handlers
  - `peer-join` / `peer-leave` → update `peers` set, fire handlers
  - `snapshot-response` → resolve the pending `requestSnapshot` promise matching the request ID
  - `snapshot-request` → invoke the consumer’s snapshot handler; send response
  - `watermark` → fire `onWatermark` handlers
  - `rejection` → fire `onRejection` handlers
- Sends outbound messages by serializing canonical JSON + ws.send().
5.2 Relay server protocol — v0.2
The relay is a forwarding switch with a per-session op log, not a logic server. Per session (URL path or query param), it maintains:
- A connection map: `publicKey → WebSocket`
- Session metadata (rootKeys, startFen): seeded by the first peer’s HELLO; immutable thereafter
- An ordered list of every signed op broadcast through the session (the session op log). Used to serve log-replay-requests from joiners. The relay never inspects op content.
- A current watermark, heuristically updated as ops flow through.

For each connection:
- On open, await `hello`. Validate `sessionId`.
- Session seeding (v0.2):
  - If the relay has no record of this `sessionId` AND `hello.seedSessionMeta` is present, store the metadata. The seeding peer is the bootstrapping owner.
  - If the relay has a record, the existing metadata is authoritative; ignore any `seedSessionMeta` in this HELLO.
  - If the relay has no record AND no `seedSessionMeta` was supplied, respond with `welcome` carrying `sessionMeta: null` and let the client decide how to proceed (typically disconnect).
- Send `welcome` with sessionMeta, current peer list, watermark, and log size (v0.2).
- Broadcast `peer-join` to all existing peers in this session; add this peer to the map.
- For each subsequent message from this peer:
  - `op` → append to session op log AND forward to all OTHER peers in this session. Update watermark heuristic.
  - `log-replay-request` (v0.2) → stream all ops in the session log (filtered by `fromHlc` if supplied) as `log-replay-chunk` messages, ending with `log-replay-end`. See §4.10.
  - `snapshot-request` → forward to a chosen peer (random, least-loaded). The relay does NOT serve snapshots itself, because that requires interpreting op content.
  - `snapshot-response` → forward to the originally-requesting peer
  - `watermark` → forward to all OTHER peers
  - `rejection` → forward to the targeted peer
- On disconnect, broadcast `peer-leave`; remove from map. Session state (op log, metadata) is retained as long as the session has at least one peer connected OR within a configurable TTL of the last peer’s disconnect.
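The forwarding core of these steps can be sketched in memory as below. `RelaySession` and the `Socket` type are hypothetical; the sketch omits the handshake, log replay, snapshot routing, and TTL handling, and its `peer-join` / `peer-leave` notices leave out fields such as `messageId` for brevity. It reads only the `type` discriminator of each frame and forwards the received string unchanged.

```typescript
// Hypothetical minimal socket: anything with a send(frame) method.
type Socket = { send(frame: string): void };

class RelaySession {
  private readonly peers = new Map<string, Socket>();
  private readonly opLog: string[] = [];

  join(publicKey: string, socket: Socket): void {
    // Notify existing peers first, then add the newcomer to the map.
    const notice = JSON.stringify({ type: 'peer-join', peer: { publicKey } });
    this.peers.forEach((peerSocket) => {
      peerSocket.send(notice);
    });
    this.peers.set(publicKey, socket);
  }

  leave(publicKey: string): void {
    if (!this.peers.delete(publicKey)) return;
    const notice = JSON.stringify({ type: 'peer-leave', peerPublicKey: publicKey });
    this.peers.forEach((peerSocket) => {
      peerSocket.send(notice);
    });
  }

  // Route one inbound frame. Only `type` is read; the frame itself is
  // stored and forwarded exactly as received.
  onFrame(fromPublicKey: string, frame: string): void {
    const parsed = JSON.parse(frame) as { type: string };
    if (parsed.type === 'op') {
      this.opLog.push(frame);
    }
    if (parsed.type === 'op' || parsed.type === 'watermark') {
      this.peers.forEach((socket, publicKey) => {
        if (publicKey !== fromPublicKey) socket.send(frame);
      });
    }
  }

  logSize(): number {
    return this.opLog.length;
  }
}

const aliceInbox: string[] = [];
const bobInbox: string[] = [];
const session = new RelaySession();
session.join('alice', { send: (f) => { aliceInbox.push(f); } });
session.join('bob', { send: (f) => { bobInbox.push(f); } });
const opFrame = '{"messageId":"m1","op":{},"type":"op"}';
session.onFrame('alice', opFrame);
console.log(bobInbox, session.logSize());
```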
The relay does not:
- Verify signatures (verification happens at endpoints per identity spec v0.2 §4)
- Walk delegation chains (chain verification happens at endpoints)
- Interpret op content (the relay never parses chess data or trust state)
- Modify message contents (canonical JSON is preserved byte-for-byte)
- Issue trust grants on anyone’s behalf (option 1 from the architectural discussion is explicitly NOT taken; the relay is a storage authority, not a trust authority)
- Hold a link table (short-token → credentials) for app-style share URLs. URL scheme is app territory — see identity v0.2 §8.5. Apps that want Lichess-style short URLs build that on their own app server, not on the OpenFile relay.
The v0.2 expansion is narrow: the relay now holds bytes, not meanings. The trust model is still anchored in cryptographic delegation chains rooted at the session’s seeded rootKeys; the relay can withhold or reorder ops (a denial-of-service vector) but cannot forge authority.
Durability. v0.2 reference relay keeps the op log in memory. Production relays should persist to disk for crash resilience (SQLite, append-only file, etc.). The interface (log-replay streaming) is identical regardless of storage backend; persistence is a deployment choice.
Bounded buffers. A relay with finite memory needs an eviction policy. Reference implementation v0.2: unbounded (development / small sessions). Production policy: prune ops below the session’s watermark, periodically request a peer-provided snapshot to compress state. v0.3 spec will codify a recommended policy.
5.3 Reconnection
When a connection drops:
- The client transitions to `'reconnecting'` state, fires `onState`.
- It retries with exponential backoff (initial 1s, doubling up to cap, jitter to avoid thundering herd).
- On successful reconnect, it sends `hello` again. The relay treats this as a fresh join (broadcasts `peer-join`).
- The reconnected peer requests a fresh snapshot from a peer to catch up on missed ops.
(A reconnect-with-resume feature could carry “last-seen HLC” in hello, letting the relay ship just the ops since then. v2 optimization; v1 keeps it simple via snapshot.)
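The backoff schedule can be sketched as a pure function. The 30-second cap and the jitter strategy (a uniform draw from the upper half of the current ceiling) are assumptions, since the spec only says “up to cap” and “jitter”; `reconnectDelayMs` is a hypothetical name. The random source is injectable so the behavior is testable.

```typescript
type BackoffOptions = { baseMs: number; capMs: number };

// attempt is 0-based: 0 for the first retry, 1 for the second, and so on.
function reconnectDelayMs(
  attempt: number,
  options: BackoffOptions = { baseMs: 1000, capMs: 30000 },
  random: () => number = Math.random,
): number {
  // Double the base each attempt, but never exceed the cap.
  const ceiling = Math.min(options.capMs, options.baseMs * Math.pow(2, attempt));
  // Jitter: pick a delay in [ceiling / 2, ceiling) so that clients that
  // dropped at the same moment do not all retry at the same moment.
  const half = ceiling / 2;
  return Math.floor(half + random() * half);
}

for (let attempt = 0; attempt < 6; attempt += 1) {
  console.log(attempt, reconnectDelayMs(attempt));
}
```

A client would stop retrying once `attempt` reaches its configured `maxAttempts` and transition to `'closed'`.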
§6 WebRTC Transport (reference impl for mesh)
The reference implementation for mesh topology. Peers connect to a signaling server to discover each other and exchange ICE candidates, then establish direct WebRTC data channels and communicate peer-to-peer thereafter.
6.1 Signaling phase
The signaling server is a dumb message relay for connection setup only (see §7 for its full spec). For each peer:
- Peer connects to signaling server (WebSocket).
- Peer sends `hello` carrying its public key + sessionId.
- Signaling server adds peer to session’s connected list; sends `welcome` with current peers.
- For each existing peer in the session, the new peer initiates a WebRTC PeerConnection: creates offer, sends offer via signaling, awaits answer, exchanges ICE candidates via signaling.
- Once a WebRTC data channel opens to the existing peer, the signaling channel for that pair is no longer needed. (The signaling connection itself stays open to learn about new joiners.)
The signaling server never sees ops, snapshots, or anything else. It only carries WebRTC setup messages and membership events.
6.2 Data channel setup
WebRTC data channels are configured for OpenFile’s needs:

```typescript
const channel = peerConnection.createDataChannel('openfile', {
  ordered: true,
  maxRetransmits: undefined, // unlimited; mimics TCP reliability
  protocol: 'openfile-v1',
});
```

This gives reliable + ordered delivery, matching the WebSocket transport’s semantics.
6.3 Mesh maintenance
After signaling, each peer has direct WebRTC connections to every other peer in the session — a full mesh. To broadcast an op, the peer sends it on every channel.
Connection count: N×(N-1)/2 for N peers. Acceptable for small sessions (≤10 peers). For larger meshes, apps should switch to star topology.
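To make the growth concrete, the two quantities can be written as small helpers (the function names are illustrative only):

```typescript
// Total direct connections in a full mesh of `peers` participants.
function meshConnectionCount(peers: number): number {
  return (peers * (peers - 1)) / 2;
}

// Data channels each individual peer maintains (and writes every op to).
function channelsPerPeer(peers: number): number {
  return Math.max(0, peers - 1);
}

// At the suggested ceiling of 10 peers: 45 connections overall, 9 per peer.
const atCeiling = [meshConnectionCount(10), channelsPerPeer(10)];
```

The total grows quadratically while the per-peer fan-out grows linearly, which is why the per-broadcast cost stays tolerable longer than the session-wide connection count does.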
Topology changes:
- A new peer joining triggers WebRTC negotiations with every existing peer. The signaling server brokers them in parallel.
- A peer leaving (clean disconnect): each remaining peer sees the WebRTC connection close and emits `peer-leave` locally.
- A peer leaving (unclean, network drop): WebRTC detects via heartbeat/timeout; behaves the same.
Watermark and gossip messages are sent on every data channel (same broadcast pattern as ops).
Snapshot requests are sent to a single chosen peer (random, least-recently-used, or app-policy). If that peer fails to respond, retry against a different one.
6.4 NAT traversal failure modes
WebRTC’s NAT traversal works for ~85-90% of network conditions but fails on:
- Symmetric NATs (most corporate networks)
- Some carrier-grade NATs
- Restrictive firewalls
For these cases, options are:
- TURN server (relay over UDP/TCP through a stable IP). v2+ feature — apps configure their own TURN servers if needed.
- Fallback to WebSocket transport — if WebRTC connection establishment fails for too long, the app prompts the user to reconnect via a relay. Requires the app to operate one; not automatic in v1.
v1 reports the failure via onState('closed') with an error reason;
apps surface this in UI. NAT-traversal reliability is a known
trade-off for mesh; star is the production answer when reliability
across all network conditions is required.
§7 Reference signaling server
A minimal NAT-traversal endpoint for WebRTC topology. Scope: just enough to get two browsers’ WebRTC connections to establish. Nothing more.
7.1 What it does
- Accepts WebSocket connections from peers.
- Groups peers by `sessionId`.
- Forwards WebRTC setup messages (offer, answer, ICE candidates) between peers in the same session.
- Emits peer-join / peer-leave events to existing peers when membership changes.
7.2 What it doesn’t do
- See ops, snapshots, or any OpenFile content.
- Track session state beyond “which peers are connected.”
- Verify identity (peer key validation happens at the data-channel level after handshake).
- Persist anything (it’s all in-memory; restart = sessions reset).
- Authenticate users (anyone can connect; if the session uses a secret, peers present it in their `hello`).
- Match-make (“find me an opponent”). Out of scope; apps build that themselves.
7.3 Wire protocol
The signaling server speaks its own tiny wire protocol, separate from the OpenFile transport wire protocol. Messages:

```json
{ "type": "hello", "publicKey": "...", "sessionId": "..." }
{ "type": "welcome", "peers": [...] }
{ "type": "peer-join", "peer": {...} }
{ "type": "peer-leave", "peerPublicKey": "..." }
{ "type": "offer", "fromPeer": "...", "toPeer": "...", "sdp": "..." }
{ "type": "answer", "fromPeer": "...", "toPeer": "...", "sdp": "..." }
{ "type": "ice", "fromPeer": "...", "toPeer": "...", "candidate": "..." }
```

These match WebRTC’s signaling needs precisely. Once peers have WebRTC data channels, the signaling server’s job for that pair is done.
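A minimal sketch of the forwarding rule for the three relayed messages (offer, answer, ice) might look like the following. The in-memory `SessionMap`, the `SendFn` callback, and `forwardSetupMessage` are assumptions made for illustration; the reference server may organize its state differently.

```typescript
// The three relayed message shapes, with the fields listed in the wire protocol.
type RelayedMessage =
  | { type: 'offer'; fromPeer: string; toPeer: string; sdp: string }
  | { type: 'answer'; fromPeer: string; toPeer: string; sdp: string }
  | { type: 'ice'; fromPeer: string; toPeer: string; candidate: string };

// Hypothetical in-memory state: sessionId -> (peer public key -> send callback).
type SendFn = (json: string) => void;
type SessionMap = Map<string, Map<string, SendFn>>;

// Forward a setup message to `toPeer`, but only within the sender's session.
// Returns false when the message is dropped.
function forwardSetupMessage(
  sessions: SessionMap,
  sessionId: string,
  msg: RelayedMessage,
): boolean {
  const peers = sessions.get(sessionId);
  if (!peers || !peers.has(msg.fromPeer)) return false;
  const send = peers.get(msg.toPeer);
  if (!send) return false;
  send(JSON.stringify(msg));
  return true;
}
```

The server never inspects `sdp` or `candidate`; it only checks that both peers belong to the same session before passing the message along.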
7.4 Sample implementation
A reference implementation in ~150 lines of Node.js ships in the
OpenFile repo under reference-servers/. Apps either:
- Deploy it as-is (Cloudflare Workers, Heroku, anything that runs Node).
- Modify it (add auth, rate limits, observability).
- Reimplement from this spec (Go, Rust, Python — straightforward).
- Use a public signaling service (if one exists with compatible protocol).
The signaling server is stateless except for the in-memory peer map. It can scale horizontally with a shared session store (Redis, etc.) — that’s app territory.
§8 Membership and join flow
8.1 The flow, end to end — v0.2
For both star and mesh, joining a session looks like:
Fresh-session creator path (the first peer in a session):
- Generate session ID (random UUID or similar) and owner keypair.
- Connect to transport endpoint.
- Send `hello` with publicKey, sessionId, AND `seedSessionMeta: { rootKeys: [ownerPubKey], startFen }`.
- Receive `welcome` confirming the seeded metadata.
- Construct OpenFileTarget with the sessionMeta + ownerSigner. The target auto-emits the root delegation as op #0 (identity v0.2 §1.7). This op is broadcast and stored in the relay’s op log.
- Begin emitting and receiving ops.
Joining-existing-session path (joining via link or invite):
- Parse the URL hash for sessionId + link credentials per identity v0.2 §7.
- Generate a fresh tab keypair (K_tab).
- Connect to transport endpoint.
- Send `hello` with publicKey=K_tab, sessionId. No `seedSessionMeta` (the session already exists).
- Receive `welcome` with sessionMeta and watermark and log size.
- Send `log-replay-request` (v0.2) and receive all ops from the relay’s session log, including the root delegation and any intermediate delegations.
- Construct OpenFileTarget with the received sessionMeta (no ownerSigner). Apply each replayed op through the standard pipeline (signature + chain verification).
- Emit per-tab delegation signed by linkSk, granting K_tab the link’s capability bundle. Broadcast via `op`.
- Begin emitting and receiving chess ops, signed by K_tab.
The joining peer is now a full participant (or scoped per the link’s capability bundle).
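The only difference between the two paths at the `hello` step is whether `seedSessionMeta` is present. A sketch of that branching, assuming the appendix's `Hello` fields plus the v0.2 extension; `buildHello`, `HelloMessage`, and `seedStartFen` are hypothetical names:

```typescript
type SessionMeta = { rootKeys: string[]; startFen: string };

// Hello fields as listed in the appendix, plus the v0.2 seedSessionMeta extension.
type HelloMessage = {
  type: 'hello';
  messageId: string;
  publicKey: string;
  sessionId: string;
  protocolVersion: 1;
  seedSessionMeta?: SessionMeta;
};

// Hypothetical helper. Pass `seedStartFen` only when creating a fresh session;
// the creator's own key then becomes the sole root key.
function buildHello(args: {
  messageId: string;
  publicKey: string;
  sessionId: string;
  seedStartFen?: string;
}): HelloMessage {
  const hello: HelloMessage = {
    type: 'hello',
    messageId: args.messageId,
    publicKey: args.publicKey,
    sessionId: args.sessionId,
    protocolVersion: 1,
  };
  if (args.seedStartFen !== undefined) {
    hello.seedSessionMeta = { rootKeys: [args.publicKey], startFen: args.seedStartFen };
  }
  return hello;
}

const standardStart = 'rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1';
const creatorHello = buildHello({ messageId: 'm1', publicKey: 'ownerPubKey', sessionId: 's1', seedStartFen: standardStart });
const joinerHello = buildHello({ messageId: 'm2', publicKey: 'K_tab', sessionId: 's1' });
```

The joiner's message carries no seed at all rather than an empty one, so a relay can tell "claiming a fresh session" apart from "joining an existing one" by the presence of the field.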
8.2 Trust bootstrap on join — v0.2
The v0.1 “trust store with addTrustedKey ops” model is replaced by the v0.2 delegation-chain model. The bootstrap question becomes: “how does K_tab earn a delegation chain to a session root?”
Closed-invite session. The owner has pre-emitted a delegate op naming the joiner’s pubkey as audience. The joiner already has the delegation in the log they replay; their ops verify against the chain immediately.
Open-with-link / Powerline session (anonymous-with-link). The URL contains a link keypair. The joiner generates K_tab and uses the link’s private key to sign a per-tab delegation: K_tab inherits the link’s capabilities (or a strict subset).
The relay never participates in the trust decision. It simply stores and forwards the delegation op; existing peers verify it via the same pipeline used for any other op.
Truly-hostless / future P2P. When no relay is online, peers exchange delegations via direct WebRTC. The same envelope and chain walk apply.
See identity v0.2 §7 (Powerline) and §8 (per-link generation lifecycle) for the cryptographic details of these flows.
8.3 Clean vs unclean leaves
Clean — peer calls transport.disconnect(). Transport closes
the WebSocket / WebRTC connections; relay (or signaling) emits
peer-leave to other peers.
Unclean — connection drops (network failure, browser crash). Detection:
- WebSocket: server’s onclose event fires after the heartbeat times out (~30-60s).
- WebRTC: the peer connection’s `iceConnectionState` transitions to `disconnected` then `failed`.
Either way, peer-leave is emitted to remaining peers. The departed
peer’s already-applied ops stay; only future ops are blocked (which
is moot if the peer is gone).
§9 Snapshot distribution
9.1 Pull request / response
The joiner requests a snapshot from a chosen peer (or the relay):

```
joiner ──snapshot-request{atHlc: W}──→ chosen-peer
chosen-peer ──snapshot-response{snapshot, tail}──→ joiner
```

The response includes:
- A snapshot at HLC ≤ atHlc.
- The op-log tail from snapshot’s HLC to the responder’s current HLC.
The joiner hydrates via createOpenFileTarget({ snapshot, opLogTail }). Once hydrated, the joiner’s HLC and trust state match the
responder’s view as of the response time.
9.2 Choosing a snapshot source
In star: typically the relay (if it maintains a snapshot cache), or the host peer if the relay doesn’t. Apps configure which.
In mesh: any connected peer. Strategies:
- Pick the longest-online peer (most likely to have a comprehensive view).
- Pick a random peer (load distribution).
- App-configurable.
If the chosen source fails (timeout, refused, returns error), retry against another. If all sources fail, the joiner enters a “waiting for snapshot” state and the app surfaces a “connection problem” UI.
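The retry-against-another-source behaviour can be sketched as a loop over candidate sources. `SnapshotSource` and `fetchSnapshotFromAny` are hypothetical names; how the candidate list is ordered (longest-online, random, app policy) is left to the caller.

```typescript
// A source is any function that requests a snapshot and rejects on
// timeout, refusal, or error (a hypothetical shape for this sketch).
type SnapshotSource<T> = () => Promise<T>;

// Try each source in the given order; resolve to undefined when all fail,
// at which point the app would enter its "waiting for snapshot" state.
async function fetchSnapshotFromAny<T>(
  sources: SnapshotSource<T>[],
): Promise<T | undefined> {
  for (const request of sources) {
    try {
      return await request();
    } catch {
      // This source failed; fall through to the next one.
    }
  }
  return undefined;
}
```

Sources are tried sequentially rather than raced, which avoids asking several peers to generate snapshots at once.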
9.3 Snapshot freshness
The joiner’s snapshot reflects the responder’s state at response time. Ops that arrived at other peers BEFORE the snapshot was generated but AFTER the responder’s view caught up are included. Ops that arrive at the responder AFTER snapshot generation but DURING the response transit are in the tail.
After hydration, the joiner is “live” — receiving fresh ops from
peers. Ops with hlc > snapshot.generatedAt arriving from anyone
are applied normally; the snapshot got them to a consistent baseline
from which divergence is impossible (per HLC monotonicity).
§10 Sync mode enforcement
The op-vocab spec defines three sync modes (§9.7): `collaborative`, `follow-leader`, `spectator`. The transport enforces them on the emit side:
- `collaborative`: transport’s `send(op)` actually transmits.
- `follow-leader`: transport’s `send(op)` is a local no-op. The op is applied locally (so the local view is consistent) but never reaches the wire.
- `spectator`: same as `follow-leader` (local-only).
Receive side is unchanged: ops from configured upstream / any peer arrive and apply per the sync-mode acceptance rules.
The transport doesn’t need to know HOW the op was signed (some modes use ephemeral keys for local-only ops, per identity spec). It just routes — or doesn’t — based on construction-time policy.
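One way to picture the emit-side gate is a wrapper that fixes the policy when the transport is constructed. `createGatedSend` and its boolean return value are illustrative assumptions, not the Transport interface itself.

```typescript
type SyncMode = 'collaborative' | 'follow-leader' | 'spectator';

// Wrap a wire-level send so the policy is fixed at construction time.
// The returned function reports whether the op actually reached the wire.
function createGatedSend<Op>(
  mode: SyncMode,
  wireSend: (op: Op) => void,
): (op: Op) => boolean {
  return (op: Op): boolean => {
    // follow-leader and spectator: applied locally by the caller, never transmitted.
    if (mode !== 'collaborative') return false;
    wireSend(op);
    return true;
  };
}
```

The gate never looks at the op or its signature, which matches the point that routing depends only on construction-time policy.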
§11 Gossip protocol (censorship defense)
The identity spec notes that signatures don’t protect against bad receivers (peers who drop legitimate ops). This section provides the optional mitigation: peers periodically exchange op-set summaries; discrepancies surface drops.
11.1 Summary format
For each peer (other than self), the summary lists per-author (maxSeq, hash):

```json
{
  "perAuthor": {
    "MCowBQ...alice": { "maxSeq": 47, "hash": "h0..." },
    "MCowBQ...bob": { "maxSeq": 92, "hash": "h1..." }
  }
}
```

Where maxSeq is the highest seq this peer has applied from author and hash is a deterministic hash of the applied-set of author’s ops up to maxSeq (e.g., XOR of opId hashes, or a Merkle root).
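A sketch of the XOR-of-opId-hashes variant follows. The 32-bit `hash32` function is only a stand-in for a real digest, and `summarize`, `OpId`, and `Summary` are names chosen for illustration; a production summary would want a collision-resistant hash.

```typescript
type OpId = { author: string; seq: number };
type Summary = { perAuthor: { [author: string]: { maxSeq: number; hash: string } } };

// FNV-1a-style 32-bit hash over UTF-16 code units; a stand-in for a real digest.
function hash32(text: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return h >>> 0;
}

// Per author: highest applied seq, plus the XOR of the hashes of applied opIds.
function summarize(applied: OpId[]): Summary {
  const acc = new Map<string, { maxSeq: number; xor: number }>();
  for (const { author, seq } of applied) {
    const h = hash32(author + ':' + seq);
    const cur = acc.get(author);
    if (cur === undefined) {
      acc.set(author, { maxSeq: seq, xor: h });
    } else {
      cur.maxSeq = Math.max(cur.maxSeq, seq);
      cur.xor = (cur.xor ^ h) >>> 0;
    }
  }
  const summary: Summary = { perAuthor: {} };
  acc.forEach((entry, author) => {
    summary.perAuthor[author] = {
      maxSeq: entry.maxSeq,
      hash: ('00000000' + entry.xor.toString(16)).slice(-8), // fixed-width hex
    };
  });
  return summary;
}
```

Because XOR is commutative, two peers that applied the same ops in different orders produce the same hash, which is the property the comparison relies on.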
11.2 Detection
When peer A receives B’s summary:
- For each author K in both summaries:
  - If `A[K].maxSeq < B[K].maxSeq`, A is behind on K’s ops; request them from B (regular snapshot or targeted op replay).
  - If `A[K].hash != B[K].hash` despite same maxSeq, divergence: one of A or B has a different applied-set, indicating selective drops or applied-set divergence.

The detection is approximate (a hash collision or a Bloom-filter false positive can mask a real difference, so it has false negatives) but catches systematic drops.
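The two checks translate fairly directly into code. This sketch assumes the summary shape from §11.1; `compareSummaries` and the `Finding` type are hypothetical names.

```typescript
type SummaryEntry = { maxSeq: number; hash: string };
type Summary = { perAuthor: { [author: string]: SummaryEntry } };

type Finding =
  | { author: string; kind: 'behind'; haveSeq: number; theirSeq: number }
  | { author: string; kind: 'diverged'; maxSeq: number };

// Compare local summary A against remote summary B, from A's point of view.
function compareSummaries(a: Summary, b: Summary): Finding[] {
  const findings: Finding[] = [];
  for (const author of Object.keys(a.perAuthor)) {
    const mine = a.perAuthor[author];
    const theirs = b.perAuthor[author];
    if (!mine || !theirs) continue; // only authors present in both summaries
    if (mine.maxSeq < theirs.maxSeq) {
      findings.push({ author, kind: 'behind', haveSeq: mine.maxSeq, theirSeq: theirs.maxSeq });
    } else if (mine.maxSeq === theirs.maxSeq && mine.hash !== theirs.hash) {
      findings.push({ author, kind: 'diverged', maxSeq: mine.maxSeq });
    }
  }
  return findings;
}
```

The function only reports; deciding whether to request a snapshot, a targeted replay, or nothing at all is left to the caller.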
11.3 Resolution
On detected discrepancy:
- The lagging peer requests missing ops via snapshot or targeted op-replay.
- Both peers continue normal operation; convergence resumes once catch-up completes.
If discrepancy persists across multiple gossip rounds with the same peer, that peer may be the bad actor. App-level policy decides: disconnect, flag, audit. The op layer doesn’t punish; it just surfaces.
11.4 Optional in v1
This is an optional feature on the Transport interface (§3). The
reference WebSocket transport may or may not include it (lean: skip
in v1; relay-based deployments rarely face censorship since the
relay sees all). The reference WebRTC transport may include a
minimal version (mesh deployments lack a central witness and benefit
more from gossip).
v2 may upgrade to required and standardize the summary format.
§12 Rejection notifications
Per identity spec §4.3, receivers can reject ops for various reasons. The transport optionally notifies the emitter.
12.1 Which reasons to share
| Reason | Notify? | Why |
|---|---|---|
| `invalid-signature` | No | Sharing leaks crypto failure detail; potential side channel for attackers. |
| `untrusted-author` | Yes | Useful UX (“you’re not authorized in this session”). |
| `invalid-op` (illegal SAN, etc.) | Yes | Useful for debugging emitter bugs. |
| `replay` / `idempotence-dup` | No | Silent dedup is correct behavior. |
| `below-watermark` | Yes | The emitter is too far behind; “fetch snapshot” UX. |
invalid-signature rejections are silent because confirming “your
signature was invalid” leaks information about cryptographic state
that attackers could probe. replay / idempotence-dup rejections are silent only because dedup needs no notification. The remaining reasons are safe to share.
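The table can be encoded as a lookup so that adding a new reason forces an explicit notify decision. Treating `replay` and `idempotence-dup` as two separate reason strings is an assumption of this sketch, as are the names `NOTIFY_EMITTER` and `shouldNotifyEmitter`.

```typescript
type RejectionReason =
  | 'invalid-signature'
  | 'untrusted-author'
  | 'invalid-op'
  | 'replay'
  | 'idempotence-dup'
  | 'below-watermark';

// Record<...> makes the compiler reject a reason that lacks an entry.
const NOTIFY_EMITTER: Record<RejectionReason, boolean> = {
  'invalid-signature': false, // avoid confirming crypto failures to a prober
  'untrusted-author': true,
  'invalid-op': true,
  'replay': false, // silent dedup
  'idempotence-dup': false, // silent dedup
  'below-watermark': true,
};

function shouldNotifyEmitter(reason: RejectionReason): boolean {
  return NOTIFY_EMITTER[reason];
}
```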
12.2 Wire format
Per §4.8.
12.3 Aggregation
If multiple peers reject the same op, the emitter may receive multiple notifications. The transport doesn’t deduplicate; apps present a “consensus” rejection UI if multiple peers reject (vs. “one peer’s network issue” if only one does).
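An app-side aggregation could be as small as counting distinct rejecting peers per op. This sketch assumes the transport tells the app which peer each notification came from (`fromPeer` is not a field of the `rejection` message itself), and the threshold of two peers for "consensus" is an illustrative choice.

```typescript
type OpId = { author: string; seq: number };

// `fromPeer` is an assumption: the transport is taken to report the sender
// of each rejection notification alongside the message.
type ReceivedRejection = { fromPeer: string; opId: OpId; reason: string };

type RejectionVerdict = 'none' | 'single-peer' | 'consensus';

// Count distinct rejecting peers for one op and classify the result.
function classifyRejections(
  received: ReceivedRejection[],
  opId: OpId,
  consensusThreshold: number = 2,
): RejectionVerdict {
  const peers = new Set<string>();
  for (const r of received) {
    if (r.opId.author === opId.author && r.opId.seq === opId.seq) {
      peers.add(r.fromPeer);
    }
  }
  if (peers.size === 0) return 'none';
  return peers.size >= consensusThreshold ? 'consensus' : 'single-peer';
}
```

Using a set means repeated notifications from the same peer do not inflate the count.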
§13 Failure modes
13.1 Connection drops
- WebSocket: TCP drop, server crash, client browser tab backgrounded for too long.
- WebRTC: ICE failure, network change, restrictive NAT, peer crash.
In both cases:
- Transport transitions to `reconnecting`.
- Outbound queue holds pending ops.
- On reconnect, ops in queue are flushed; a fresh snapshot is fetched to catch up on missed inbound ops.
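The hold-then-flush behaviour of the outbound queue can be sketched as a small class. `OutboundQueue` and its method names are hypothetical; the snapshot fetch for missed inbound ops is a separate step and is not modelled here.

```typescript
// Hypothetical helper: holds outbound ops while the connection is down.
class OutboundQueue<Op> {
  private pending: Op[] = [];
  private connected = false;
  private readonly wireSend: (op: Op) => void;

  constructor(wireSend: (op: Op) => void) {
    this.wireSend = wireSend;
  }

  // Transmit immediately when connected; otherwise hold the op.
  send(op: Op): void {
    if (this.connected) this.wireSend(op);
    else this.pending.push(op);
  }

  onDisconnected(): void {
    this.connected = false;
  }

  // Flush held ops in their original order, then resume direct sends.
  onConnected(): void {
    this.connected = true;
    const queued = this.pending;
    this.pending = [];
    for (const op of queued) this.wireSend(op);
  }
}
```

Flushing in insertion order preserves the per-sender ordering that the reliable-ordered-delivery requirement expects.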
13.2 NAT-traversal failure (mesh)
A peer can’t establish a WebRTC connection to another peer due to NAT restrictions:
- Connection attempt times out.
- Transport surfaces `onState('closed')` with reason.
- App may prompt the user to retry, or recommend switching to a relay-backed deployment.
13.3 Peer crash without clean disconnect
Detection via WebSocket heartbeat (server-managed timeout) or WebRTC ICE failure. Treated as unclean leave.
13.4 Relay crash (star)
All clients lose their WebSocket. They enter reconnecting and
retry. Once the relay is back, they reconnect and re-sync. If the
relay was the only snapshot authority and didn’t persist, late
joiners can’t catch up until other peers reconnect (since they hold
state in-memory).
For production deployments: the relay SHOULD persist (write op log to a database). Apps that don’t need durability can skip persistence; those that do treat the relay as the source of truth.
13.5 Signaling server crash (mesh)
Existing WebRTC connections are unaffected (they’re peer-to-peer; the signaling server isn’t in the data path). New joiners can’t connect until the signaling server is back, but established peers continue normally.
This is a real architectural strength of mesh: the signaling server is only critical during connection establishment.
13.6 Network partition
If peers split into disjoint groups by network failure, each group continues independently. Ops emitted in group A don’t reach group B. On reconnection (partition heals), peers exchange watermarks and op-set summaries; missing ops in either direction are requested. Eventually-consistent merge resumes.
The CRDT design (op-vocab) handles this naturally — no special partition-tolerance code required at the transport layer.
§14 Sample integrations
14.1 Star deployment: tournament broadcast

```typescript
import { createOpenFileTarget } from 'openfile';
import { createWebSocketTransport } from 'openfile/transport-ws';
import { createWebCryptoSigner, createWebCryptoVerifier } from 'openfile/crypto';

const myKeypair = await loadKeypairFromAccount();
const sessionMeta = await fetch('/api/session/' + tournamentId).then(r => r.json());

const target = createOpenFileTarget({
  transport: createWebSocketTransport({
    url: `wss://relay.tournaments.example.com/session/${tournamentId}`,
    publicKey: myKeypair.publicKey,
    sessionId: tournamentId,
  }),
  rootKeys: sessionMeta.rootKeys,
  signer: createWebCryptoSigner(myKeypair),
  verifier: createWebCryptoVerifier(),
  syncMode: 'spectator', // viewers can't broadcast
});

await target.transport.connect();
const game = createGameFromTarget(target, { onTree, onCursor });
```

Spectator-mode peers receive moves as the players make them; their local analysis stays local. The relay’s only job is forwarding; authentication, signature verification, and trust derivation are all client-side.
14.2 Mesh deployment: ad-hoc study
```typescript
import { createOpenFileTarget } from 'openfile';
import { createWebRTCTransport } from 'openfile/transport-webrtc';
import { generateEphemeralKeypair, createWebCryptoSigner, createWebCryptoVerifier } from 'openfile/crypto';

// Parse session from link (assumes parseLink also returns the session ID).
const { sessionId, rootKey, secret, signalingUrl } = parseLink(window.location);

// Fresh ephemeral keypair for this browser.
const myKeypair = await generateEphemeralKeypair();

const target = createOpenFileTarget({
  transport: createWebRTCTransport({
    signalingUrl,
    publicKey: myKeypair.publicKey,
    sessionId,
    sessionSecret: secret, // for trust bootstrap
  }),
  rootKeys: [rootKey],
  signer: createWebCryptoSigner(myKeypair),
  verifier: createWebCryptoVerifier(),
  syncMode: 'collaborative',
});

await target.transport.connect();
const game = createGameFromTarget(target, { onTree, onCursor });
```

Two browsers, no server (besides the public signaling server for NAT traversal). Once WebRTC connects, the signaling server is out of the path. Closing both browsers ends the session (no persistence by default; apps can wire IndexedDB if they want resume).
§15 What’s deliberately NOT in the spec
- Application-level identity / authentication — identity spec.
- Op signing / verification details — identity spec.
- Op vocabulary, validation rules, apply machinery — op-vocab spec.
- Persistence formats — op-vocab spec §10.
- TURN server configuration — app/deployment policy.
- Relay scaling / sharding — deployment territory.
- Matchmaking — app-level product, not transport.
- Presence features — peer cursor sharing, “Alice is here” UX. Layered above this spec; uses membership events as substrate.
- Channel encryption beyond TLS/DTLS — WebSocket-over-WSS and WebRTC-over-DTLS provide channel security. End-to-end encryption inside (so the relay can’t read content) is v2+.
§16 Open questions
- Reconnect-with-resume vs. snapshot-on-reconnect. §5.3 says “fetch a fresh snapshot on reconnect.” A reconnect-with-resume feature carries last-seen HLC in `hello`; relay ships just the ops since then. Optimization; v2 if profiling shows reconnect is too slow.
- Gossip-protocol summary format. §11.1 sketches per-author maxSeq + hash. Bloom filters would be more compact for large per-author seq ranges; trade-off is false-positive complications. Defer to implementation; reference impls pick one and document.
- Relay-as-authority snapshot vs peer-as-authority. §9.2 leaves this app-configurable. Best-practice guidance: in star, relays that persist serve snapshots; in mesh, any peer. Worth documenting patterns more concretely as deployment guides accumulate.
- Cross-relay federation: multiple relays for one session, gossiping among themselves. v2+ for very-large-session apps.
- Backwards-compatible protocol evolution. v1 fails cleanly on version mismatch (§4.9). When v2 lands, do we support v1 ↔ v2 downgrade negotiation? Probably yes; the cost is small and the adoption story is much better.
Resolved during v0.1 design
- Topology choice → both, behind a pluggable Transport interface (§1.1).
- WebSocket vs WebRTC → both, as two reference impls.
- Wire format → JSON canonical (matches signing format from identity spec).
- Reliability semantics → reliable + ordered required at the Transport interface; both reference impls provide it (§1.2).
- Membership awareness → strong, baked into the Transport interface (§1.4).
- Snapshot exchange → pull only (§1.5).
- Signaling server scope → bare minimum, NAT-traversal only (§7.2).
- Relay verification responsibilities → none. Relay is a forwarder; verification at endpoints (§5.2).
§17 What’s next
After this spec is locked:
1. Reference WebSocket transport: client + relay server. ~300 LOC client, ~200 LOC server.
2. Reference signaling server: for WebRTC. ~150 LOC Node.js.
3. Reference WebRTC transport: client. ~500 LOC.
4. Integration with OpenFileTarget: wire the Transport interface into the op layer’s emit / receive pipeline.
5. Connection-lifecycle UX: reconnect, snapshot-on-rejoin.
6. Gossip protocol: optional v1, recommended for mesh.
7. Sample apps: the §14 examples as runnable demos.
Steps 1-4 are required for any networked OpenFile deployment. Steps 5-7 land progressively as real consumers materialize.
Appendix: protocol version 1 message reference
For quick reference, all v1 message types and their fields:

```typescript
// Sent peer → relay/peer
type Hello = {
  type: 'hello';
  messageId: string;
  publicKey: string;
  sessionId: string;
  sessionSecret?: string;
  protocolVersion: 1;
};

// Sent relay/peer → joiner
type Welcome = {
  type: 'welcome';
  messageId: string;
  inReplyTo: string;
  sessionMeta: { rootKeys: string[]; startFen: string };
  currentPeers: PeerInfo[];
  watermark: number;
  protocolVersion: 1;
  error?: 'version-mismatch' | 'session-not-found' | 'auth-failed';
};

// Sent peer → all peers
type OpMessage = {
  type: 'op';
  messageId: string;
  op: SignedOp;
};

// Sent relay/signal → peers
type PeerJoin = {
  type: 'peer-join';
  messageId: string;
  peer: PeerInfo;
};

type PeerLeave = {
  type: 'peer-leave';
  messageId: string;
  peerPublicKey: string;
};

// Sent peer → peer
type SnapshotRequest = {
  type: 'snapshot-request';
  messageId: string;
  atHlc: number;
  requestedBy: string;
};

type SnapshotResponse = {
  type: 'snapshot-response';
  messageId: string;
  inReplyTo: string;
  snapshot?: SnapshotData;
  tail?: SignedOp[];
  error?: 'no-snapshot-at-hlc' | 'load-shedding' | 'refused';
};

// Sent peer → all peers
type Watermark = {
  type: 'watermark';
  messageId: string;
  fromPeer: string;
  hlc: number;
};

// Sent peer → all peers (optional)
type OpSetSummary = {
  type: 'op-set-summary';
  messageId: string;
  fromPeer: string;
  summary: { perAuthor: { [author: string]: { maxSeq: number; hash: string } } };
};

// Sent peer → emitter (optional)
type Rejection = {
  type: 'rejection';
  messageId: string;
  toPeer: string;
  opId: { author: string; seq: number };
  reason: 'invalid-signature' | 'untrusted-author' | 'invalid-op' | 'below-watermark';
};
```