Win-rate inference

Lesson 0001. Big-picture map. The whole method on one page: four contributions and how every module maps to a claim.
Lesson 0002. The estimand & why ψ travels. Why the win-rate is the probabilistic index, and why that identity is the paper's leverage.
Lesson 0003. Inference under crossed dependence. Why naive SEs are wrong when comparisons share prompts, judges, and reused responses.
Lesson 0004. Prompt-cancellation & standardization. Why the difference-form PIM can't adjust for prompt mix, and the standardized-marginal fix.
Lesson 0005. Pairwise → leaderboard. Rank-confidence sets and transitivity diagnostics: report what the data actually resolve.
Lesson 0006. Identification & the seams to Paper 2. Where identification holds, where it strains, and the openings that become the next paper.
Lesson 0007. Capstone: defending the leaderboard. Put it together: defend every number against a skeptical reviewer.
Lesson 0008. The standardized-marginal argument for Jean. The noncollapsibility argument, made airtight.
Lesson 0009. From ψ to the leaderboard score, and how we linearize it. The influence-function derivation behind the inference primitive.
Reference. Glossary. All symbols and terms in one place.