Monte-Carlo Pre-Roll Equity

For positions that are absent or too sparse in the games database to read an empirical win-rate, MonteCarloEquity estimates the pre-roll outcome distribution on demand:

val est = MonteCarloEquity.estimate(state, MonteCarloConfig(rollouts = 2000), new Random(seed))
est.whiteWin // P(White captures a king first)
est.blackWin // P(Black captures a king first)
est.undecided // residual survival mass at the ply horizon

The three masses partition every rollout and sum to 1.0. “Pre-roll” means the estimate is for a position before the dice are rolled — the same quantity analytics reads empirically — so it slots directly into the doubling-cube equity guidance.

A vanilla Monte-Carlo rollout plays a full random game and scores a 0/1 win at the end. Its per-rollout variance is that of a Bernoulli variable: p·(1 - p), which needs many samples to tighten.
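To make "many samples" concrete, here is the standard error of the vanilla estimator worked through for one case. The values p = 0.5 and a target standard error of 0.01 are illustrative choices, not figures from the engine.

```latex
\mathrm{SE}_{\text{vanilla}} = \sqrt{\frac{p\,(1 - p)}{N}}
\qquad\Longrightarrow\qquad
N = \frac{p\,(1 - p)}{\mathrm{SE}^2}
  = \frac{0.5 \cdot 0.5}{0.01^2}
  = 2500 \ \text{rollouts}
```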

The engine already computes, for any position, the exact probability that the side to move captures a king on its upcoming roll — KingCaptureProbability enumerates all 216 dice outcomes (56 weighted multisets). Instead of sampling a win/loss at each node, we integrate that exact term along the rollout, weighted by the probability the game is still alive:
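The 216-to-56 reduction can be reproduced in a few lines. This is a minimal sketch, assuming three six-sided dice (which matches 6³ = 216 ordered rolls and 56 multisets); DiceMultisets and its members are hypothetical names, not the KingCaptureProbability API.

```scala
// Illustrative sketch only: these names are not part of the engine.
object DiceMultisets {
  /** Every ordered roll of three six-sided dice: 6 * 6 * 6 = 216 outcomes. */
  val orderedRolls: Seq[List[Int]] =
    for {
      a <- 1 to 6
      b <- 1 to 6
      c <- 1 to 6
    } yield List(a, b, c)

  /** Each distinct multiset (sorted roll) mapped to the number of ordered rolls that produce it. */
  val weighted: Map[List[Int], Int] =
    orderedRolls.groupBy(_.sorted).map { case (multiset, rolls) => multiset -> rolls.size }

  /** Exact probability of an event that depends only on the multiset rolled. */
  def probability(event: List[Int] => Boolean): Double =
    weighted.collect { case (multiset, weight) if event(multiset) => weight }.sum / 216.0
}
```

Evaluating an event once per multiset and multiplying by its weight gives the same exact probability as visiting all 216 ordered rolls, with roughly a quarter of the evaluations.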

survive = 1
for each ply (side S to move):
    p = P(S captures the opponent king on this roll)    // exact, over all 216 rolls
    winsOf(S) += survive * p
    survive *= (1 - p)
    advance to a random surviving continuation          // a sampled non-capturing turn
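The accumulation in the loop above can be sketched in Scala as follows. The board and the sampled continuation are abstracted away: the function takes the per-ply exact capture probabilities as given. Side, Masses and accumulate are hypothetical names for illustration, not the engine's types.

```scala
// Illustrative sketch only: these names are not part of the engine.
object RolloutAccumulation {
  sealed trait Side
  case object White extends Side
  case object Black extends Side

  final case class Masses(whiteWin: Double, blackWin: Double, undecided: Double)

  /** Folds one rollout into win masses.
    * `plies` lists, in play order, the side to move and its exact capture probability. */
  def accumulate(plies: Seq[(Side, Double)]): Masses = {
    var survive = 1.0 // probability the game is still alive before this ply
    var white = 0.0
    var black = 0.0
    for ((side, p) <- plies) {
      val gained = survive * p // mass that ends the game on this ply
      side match {
        case White => white += gained
        case Black => black += gained
      }
      survive *= (1.0 - p)
    }
    // Whatever is still alive at the horizon is the undecided mass.
    Masses(white, black, survive)
  }
}
```

For example, `accumulate(Seq((White, 0.5), (Black, 0.5)))` yields `Masses(0.5, 0.25, 0.25)`: White takes half the mass on the first ply, Black takes half of what survives, and the remainder is undecided.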

This is Rao-Blackwellization: replacing a sampled indicator with its conditional expectation can only reduce variance. The continuation is still sampled (via TurnGenerator + the RandomSearch policy), so the estimator stays unbiased to first order while the dominant win/loss mass at every node is added analytically.

EquityEstimate.varianceReductionVsVanilla reports the measured ratio mean·(1 - mean) / sampleVariance — how many times smaller the per-rollout variance is than a vanilla 0/1 estimator with the same mean. This is the self-check ported from the C++ reference; a value > 1 quantifies the win, and +Infinity means the position is resolved exactly (zero sample variance — e.g. decided on the first roll).
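A minimal sketch of that ratio, computed from the per-rollout White-win values. It assumes the unbiased (n - 1) sample variance, which the text above does not specify; VarianceCheck and varianceReduction are hypothetical names, not the EquityEstimate implementation.

```scala
// Illustrative sketch only: these names are not part of the engine.
object VarianceCheck {
  /** mean * (1 - mean) / sampleVariance; +Infinity when the sample variance is zero. */
  def varianceReduction(samples: Seq[Double]): Double = {
    require(samples.size >= 2, "need at least two rollouts")
    val n = samples.size
    val mean = samples.sum / n
    // Assumption: unbiased sample variance with an (n - 1) denominator.
    val sampleVariance = samples.map(x => (x - mean) * (x - mean)).sum / (n - 1)
    val vanillaVariance = mean * (1.0 - mean) // Bernoulli variance at the same mean
    if (sampleVariance == 0.0) Double.PositiveInfinity
    else vanillaVariance / sampleVariance
  }
}
```

Two rollouts returning 0.4 and 0.6 have mean 0.5 and sample variance 0.02, so the ratio is about 12.5; identical rollouts give +Infinity.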

The C++ reference advances the rollout to any sampled legal turn, including king captures. A king capture is terminal, so following it produces a king-less board that is then played on, which introduces a small second-order bias. Because this implementation is the rules source of truth, it instead conditions the continuation on survival: it advances only through turns that do not capture a king (the event with probability 1 - p that survive already tracks). If a sampled roll has no surviving continuation, it is re-rolled. The per-ply analytic terms, and therefore the variance self-check, are unchanged.
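The survival-conditioned advance might look like the sketch below. Turn generation is passed in as a function, so nothing here is the real TurnGenerator API. The uniform choice among surviving turns and the maxRerolls safety cap are assumptions added for illustration.

```scala
import scala.util.Random

// Illustrative sketch only: these names are not part of the engine.
object SurvivingContinuation {
  /** Samples a continuation conditioned on survival.
    * `turnsForRoll` draws a roll and returns its legal turns (a stand-in for turn generation);
    * `capturesKing` flags terminal turns. Returns None if `maxRerolls` consecutive rolls
    * offer no surviving turn. */
  def sample[T](turnsForRoll: Random => Seq[T],
                capturesKing: T => Boolean,
                rng: Random,
                maxRerolls: Int = 1000): Option[T] = {
    var attempt = 0
    while (attempt < maxRerolls) {
      // Keep only turns that leave both kings on the board.
      val surviving = turnsForRoll(rng).filterNot(capturesKing)
      if (surviving.nonEmpty) return Some(surviving(rng.nextInt(surviving.size)))
      attempt += 1 // this roll had no surviving continuation: re-roll
    }
    None
  }
}
```

The cap only guards against an endless loop in a position where every roll captures; in that case p = 1 and the survival mass is already zero, so the rollout has nothing left to add.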

MonteCarloConfig controls cost:

  • rollouts — fixed sample count (and the hard cap when adaptive stopping is on).
  • maxPlies — rollout horizon; survival mass still alive at the horizon is reported as undecided.
  • targetError — when > 0, stop early once the White-win standard error reaches the target (after minRollouts). Standard error scales as 1/√rollouts, so halving it costs 4× the rollouts.
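How these three settings interact can be sketched as a stopping loop. This is an assumption-laden illustration rather than the MonteCarloEquity code: nextSample stands in for one rollout's White-win value, the running variance uses Welford's update, and AdaptiveStopping, Result and run are hypothetical names.

```scala
// Illustrative sketch only: these names are not part of the engine.
object AdaptiveStopping {
  final case class Result(mean: Double, standardError: Double, rolloutsUsed: Int)

  /** Draws rollouts until the standard error of the mean is at most `targetError`
    * (checked only from `minRollouts` onward) or `maxRollouts` is reached.
    * A `targetError` of 0 disables early stopping. */
  def run(nextSample: () => Double, maxRollouts: Int, minRollouts: Int, targetError: Double): Result = {
    var n = 0
    var mean = 0.0
    var m2 = 0.0 // Welford running sum of squared deviations from the mean
    var se = Double.PositiveInfinity
    var done = false
    while (!done && n < maxRollouts) {
      val x = nextSample()
      n += 1
      val delta = x - mean
      mean += delta / n
      m2 += delta * (x - mean)
      if (n >= 2) se = math.sqrt(m2 / (n - 1) / n) // sqrt(sampleVariance / n)
      done = targetError > 0 && n >= minRollouts && se <= targetError
    }
    Result(mean, se, n)
  }
}
```

Because the standard error shrinks as 1/√n, a position with high per-rollout variance runs to the hard cap, while a sharp position with near-zero variance stops at minRollouts.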

The dominant cost per ply is the exact king-capture term (56 weighted DFS probes), not the random advance. MonteCarloEquityBenchmark tracks fixed-budget time (→ rollouts/sec) and time-to-target-CI across positions, which decides where the estimator runs: server-side, cached, or compiled to WASM for the browser.

  • Convergence: on a position where every one of the 216 rolls captures the king, the estimate is exactly 1.0 for the side to move (and its colorFlip gives 1.0 for the other side).
  • Invariant: whiteWin + blackWin + undecided == 1 for every position.
  • Variance: the reference self-check (varianceReductionVsVanilla > 1) on a sharp position.
  • Determinism: a fixed seed reproduces the estimate bit-for-bit, and adaptive stopping halts at the configured minRollouts when the target error is already met by then.