Monte-Carlo Pre-Roll Equity
For positions that are absent from the games database, or too sparse in it to yield an empirical
win rate, MonteCarloEquity estimates the pre-roll outcome distribution on demand:

    val est = MonteCarloEquity.estimate(state, MonteCarloConfig(rollouts = 2000), new Random(seed))
    est.whiteWin   // P(White captures a king first)
    est.blackWin   // P(Black captures a king first)
    est.undecided  // residual survival mass at the ply horizon

The three masses partition every rollout and sum to 1.0. “Pre-roll” means the estimate is for a
position before the dice are rolled (the same quantity analytics reads empirically), so it slots
directly into the doubling-cube equity guidance.
Why Rao-Blackwell
A vanilla Monte-Carlo rollout plays a full random game and scores a 0/1 win at the end. Its
per-rollout variance is that of a Bernoulli variable: p·(1 - p), which needs many samples to
tighten.
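As a worked illustration of "many samples", the standard error of a vanilla estimate over n independent rollouts follows directly from the Bernoulli variance. The values p = 0.5 and n = 2000 below are chosen only for illustration (2000 echoes the rollout count in the example call above); they are not measurements from the engine.

```latex
\operatorname{Var}(X) = p\,(1 - p), \qquad
\operatorname{SE}(\hat{p}_n) = \sqrt{\frac{p\,(1 - p)}{n}}
\quad\Longrightarrow\quad
\operatorname{SE}\Big|_{p = 0.5,\; n = 2000} = \sqrt{\frac{0.25}{2000}} \approx 0.0112
```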
The engine already computes, for any position, the exact probability that the side to move
captures a king on its upcoming roll — KingCaptureProbability
enumerates all 216 dice outcomes (56 weighted multisets). Instead of sampling a win/loss at each
node, we integrate that exact term along the rollout, weighted by the probability the game is
still alive:

    survive = 1
    for each ply (side S to move):
      p = P(S captures the opponent king on this roll)   // exact, over all 216 rolls
      winsOf(S) += survive * p
      survive *= (1 - p)
      advance to a random surviving continuation         // a sampled non-capturing turn

This is Rao-Blackwellization: replacing a sampled indicator with its conditional expectation
can only reduce variance. The continuation is still sampled (via TurnGenerator - the
RandomSearch policy), so the estimator stays unbiased to first order while the dominant win/loss
mass at every node is added analytically.
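The loop above can be sketched in Scala on a toy model. This is not the engine's code: the position is collapsed to an integer, captureProb is a hypothetical stand-in for the exact term that KingCaptureProbability supplies, and the random step stands in for a sampled non-capturing turn. The names ToyRollout, Masses and rollout are invented for the illustration.

```scala
import scala.util.Random

object ToyRollout {
  // Per-rollout masses; white + black + undecided == 1 up to rounding.
  final case class Masses(white: Double, black: Double, undecided: Double)

  // Hypothetical stand-in for the exact per-roll capture probability.
  // A real engine would enumerate the dice outcomes for the position.
  def captureProb(position: Int): Double =
    0.05 + 0.1 * (math.abs(position) % 5)

  // One Rao-Blackwellized rollout: accumulate survive * p analytically,
  // then sample only the continuation.
  def rollout(start: Int, maxPlies: Int, rng: Random): Masses = {
    var survive = 1.0
    var white = 0.0
    var black = 0.0
    var position = start
    var whiteToMove = true
    var ply = 0
    while (ply < maxPlies && survive > 0.0) {
      val p = captureProb(position)
      if (whiteToMove) white += survive * p else black += survive * p
      survive *= (1.0 - p)
      position += rng.nextInt(3) - 1 // toy "random surviving continuation"
      whiteToMove = !whiteToMove
      ply += 1
    }
    // Whatever mass is still alive at the horizon is reported as undecided.
    Masses(white, black, survive)
  }
}
```

Each ply moves the mass survive * p out of survive and into one side's total, so the three masses of a single rollout always add up to 1; averaging Masses over many rollouts gives the estimate.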
EquityEstimate.varianceReductionVsVanilla reports the measured ratio mean·(1 - mean) / sampleVariance
— how many times smaller the per-rollout variance is than a vanilla 0/1 estimator with the same mean.
This is the self-check ported from the C++ reference; a value > 1 quantifies the win, and
+Infinity means the position is resolved exactly (zero sample variance — e.g. decided on the first
roll).
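A minimal sketch of how such a ratio could be computed from the per-rollout White-win values. The object and method names are hypothetical, and the use of the unbiased (n - 1) sample variance is an assumption; the source does not say which variance convention the engine or the C++ reference uses.

```scala
object VarianceCheck {
  // Ratio mean * (1 - mean) / sampleVariance for per-rollout win masses.
  // Returns +Infinity when every rollout produced the same value.
  def varianceReduction(samples: Seq[Double]): Double = {
    require(samples.size >= 2, "need at least two rollouts")
    val n = samples.size
    val mean = samples.sum / n
    // Assumption: unbiased sample variance with an (n - 1) denominator.
    val sampleVariance =
      samples.map(x => (x - mean) * (x - mean)).sum / (n - 1)
    val bernoulliVariance = mean * (1.0 - mean)
    if (sampleVariance == 0.0) Double.PositiveInfinity
    else bernoulliVariance / sampleVariance
  }
}
```

For example, two rollouts scoring 0.4 and 0.6 have mean 0.5 and sample variance 0.02, so the ratio is 0.25 / 0.02 = 12.5.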
Conditioning the rollout on survival
The C++ reference advances the rollout to any sampled legal turn, including king-captures. A king
capture is terminal, so following it produces a king-less board that is then played on — a small
second-order bias. As the rules source of truth, this implementation instead conditions the
continuation on survival: it advances only through turns that do not capture a king (the event
with probability 1 - p that survive already tracks). If a sampled roll has no surviving
continuation it is re-rolled; the per-ply analytic terms — and therefore the variance self-check —
are unchanged.
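The re-roll rule can be sketched as a small rejection loop. Everything here is illustrative: Turn, rollDice and legalTurns are hypothetical hooks rather than the engine's types, choosing uniformly among the surviving turns is an assumption, and the maxAttempts guard is added only so the sketch terminates on a position where every roll captures.

```scala
import scala.util.Random

object SurvivingContinuation {
  // Hypothetical turn record; the engine's real turn type is not shown here.
  final case class Turn(label: String, capturesKing: Boolean)

  // Re-roll until some roll offers a non-capturing turn, then pick one of
  // those turns uniformly. Returns None if no survivor is found in time.
  def sample[R](
      rollDice: Random => R,
      legalTurns: R => Seq[Turn],
      rng: Random,
      maxAttempts: Int = 1000
  ): Option[Turn] = {
    var attempt = 0
    while (attempt < maxAttempts) {
      val survivors = legalTurns(rollDice(rng)).filterNot(_.capturesKing)
      if (survivors.nonEmpty)
        return Some(survivors(rng.nextInt(survivors.size)))
      attempt += 1
    }
    None
  }
}
```

Because the analytic term survive * p is added before the continuation is chosen, filtering the sampled turns this way changes only which board the rollout visits next, not the mass already credited at the current ply.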
Budgeting
MonteCarloConfig controls cost:
- rollouts: fixed sample count (and the hard cap when adaptive stopping is on).
- maxPlies: rollout horizon; survival mass still alive at the horizon is reported as undecided.
- targetError: when > 0, stop early once the White-win standard error reaches the target (after minRollouts). Standard error scales as 1/√rollouts, so halving it costs 4× the rollouts.
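A minimal sketch of how these three budget knobs could interact, assuming the standard error is tracked with a running mean and variance (Welford's algorithm). The AdaptiveStop object and its parameter names are hypothetical, and sampleOnce stands in for one rollout's White-win mass; the engine's actual stopping loop may differ.

```scala
object AdaptiveStop {
  // Runs `sampleOnce` until the standard error of the mean drops to
  // `targetError` (checked only from `minRollouts` on) or `maxRollouts`
  // is reached. Returns (mean, standardError, rolloutsUsed).
  def run(
      sampleOnce: () => Double,
      minRollouts: Int,
      maxRollouts: Int,
      targetError: Double
  ): (Double, Double, Int) = {
    var n = 0
    var mean = 0.0
    var m2 = 0.0 // running sum of squared deviations (Welford)
    var stdErr = Double.PositiveInfinity
    var done = false
    while (n < maxRollouts && !done) {
      val x = sampleOnce()
      n += 1
      val delta = x - mean
      mean += delta / n
      m2 += delta * (x - mean)
      // Standard error = sqrt(sampleVariance / n), defined from n = 2.
      if (n >= 2) stdErr = math.sqrt(m2 / (n - 1) / n)
      done = targetError > 0.0 && n >= minRollouts && stdErr <= targetError
    }
    (mean, stdErr, n)
  }
}
```

With targetError set to 0 the loop never stops early and always uses the full maxRollouts budget, which mirrors the fixed-sample-count behaviour described for rollouts.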
The dominant cost per ply is the exact king-capture term (56 weighted DFS probes), not the random
advance. MonteCarloEquityBenchmark tracks fixed-budget time (→ rollouts/sec) and time-to-target-CI
across positions, which decides where the estimator runs: server-side, cached, or compiled to WASM
for the browser.
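The figure of 56 weighted probes is consistent with three six-sided dice (6³ = 216 ordered rolls), which is an inference from the numbers on this page rather than something it states outright. Under that assumption, the multisets and their weights can be enumerated as follows; DiceMultisets is a hypothetical name, not the engine's enumerator.

```scala
object DiceMultisets {
  // Each sorted triple (a <= b <= c) paired with the number of ordered
  // rolls of three dice that it represents.
  val weighted: Seq[((Int, Int, Int), Int)] =
    for {
      a <- 1 to 6
      b <- a to 6
      c <- b to 6
    } yield {
      val weight =
        if (a == c) 1                // all three dice equal
        else if (a == b || b == c) 3 // exactly one pair
        else 6                       // all three distinct
      ((a, b, c), weight)
    }
}
```

There are 6 triples, 30 pair-plus-single combinations and 20 all-distinct combinations, giving 56 probes whose weights total 6·1 + 30·3 + 20·6 = 216.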
Verification
- Convergence: on a position where every one of the 216 rolls captures the king, the estimate is exactly 1.0 for the side to move (and its colorFlip gives 1.0 for the other side).
- Invariant: whiteWin + blackWin + undecided == 1 for every position.
- Variance: the reference self-check (varianceReductionVsVanilla > 1) on a sharp position.
- Determinism: a fixed seed reproduces the estimate bit-for-bit; adaptive stopping halts at the configured minRollouts once the target error is met.