Kelly Criterion - The Mathematics of Position Sizing
By EC Assets Research Team, Risk Strategy
Kelly Criterion — The Kelly Criterion is a mathematical formula for position sizing that maximises the geometric growth rate of capital. It provides the optimal bet size given the probability of winning and the ratio of potential gain to potential loss.
Definition
The Kelly Criterion is a mathematical formula for determining the optimal size of a series of bets. It was derived in 1956 by John Kelly Jr., a researcher at Bell Labs, in a paper titled "A New Interpretation of Information Rate." The original context was information theory and the maximum rate at which information could be transmitted through noisy channels; the mathematical framework turned out to apply to gambling and financial position sizing as well.
The criterion answers a specific question: given a sequence of independent bets with known win probability and payoff, what fraction of bankroll should you bet on each opportunity to maximise the long-run geometric growth rate of capital?
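The answer, Kelly proved, is the formula f* = (bp - q) / b, where f* is the fraction of bankroll to stake, b is the net odds (profit per unit staked on a win), p is the win probability, and q = 1 - p is the loss probability. This single formula gives the position size that produces the highest expected logarithm of wealth over many bets. A negative result means there is no edge and the bet should not be taken.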
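The formula is short enough to express directly. The sketch below is illustrative only; the function name `kelly_fraction` and the input checks are assumptions made for this example, not part of Kelly's paper.

```python
def kelly_fraction(p: float, b: float) -> float:
    """Full-Kelly stake as a fraction of bankroll: (b*p - q) / b.

    p: probability of winning (strictly between 0 and 1)
    b: net odds, i.e. profit per unit staked on a win (positive)
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    if b <= 0.0:
        raise ValueError("b must be positive")
    q = 1.0 - p  # probability of losing
    return (b * p - q) / b


# 55% win probability at even odds
print(f"{kelly_fraction(0.55, 1.0):.2%}")
# 40% win probability at even odds: negative, so there is no edge to bet on
print(f"{kelly_fraction(0.40, 1.0):.2%}")
```

A negative return value is left as-is here so the caller can see how unfavourable the bet is; a production sizing routine would more likely floor it at zero.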
Why Geometric Growth Matters
Kelly's insight was that long-run wealth follows geometric (not arithmetic) growth. Gains and losses compound multiplicatively: each result changes the capital staked on the next bet, so a 50% loss needs a 100% gain to recover. Consider four sizing approaches for a coin flip with a 55% win probability and 1:1 payoff:
| Bet size | Median wealth multiple after 100 flips | Maximum drawdown |
|---|---|---|
| 5% (under-Kelly) | 1.46x | ~25% |
| 10% (full Kelly) | 1.65x | ~50% |
| 20% (over-Kelly) | 0.99x (roughly flat) | ~75% |
| 30% (way over-Kelly) | 0.20x (loss) | ~90% |
The 10% (full Kelly) sizing produces the highest long-run growth. Above 10%, volatility drag dominates: growth falls even though the expected arithmetic return per bet is higher, and beyond roughly twice the Kelly fraction it turns negative. Betting below 10% leaves growth on the table. This non-monotonic relationship is structural to compounding.
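The growth comparison can be reproduced with a few lines of arithmetic. The sketch below assumes the same coin flip (55% win probability, 1:1 payoff) and takes the median outcome to be exactly 55 wins in 100 flips; the function names are hypothetical. It does not estimate drawdowns, which would need a path simulation.

```python
import math


def log_growth(f: float, p: float = 0.55, b: float = 1.0) -> float:
    """Expected log growth per bet when staking fraction f of capital."""
    return p * math.log(1.0 + b * f) + (1.0 - p) * math.log(1.0 - f)


def median_multiple(f: float, n: int = 100, p: float = 0.55, b: float = 1.0) -> float:
    """Wealth multiple after n bets when exactly p*n of them are wins."""
    return math.exp(n * log_growth(f, p, b))


for f in (0.05, 0.10, 0.20, 0.30):
    print(f"{f:.0%}: growth per bet {log_growth(f):+.5f}, "
          f"multiple after 100 flips {median_multiple(f):.2f}")
```

Growth per bet peaks at the 10% stake and is negative at both 20% and 30%, which is the non-monotonic pattern described above.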
The Fractional Kelly Compromise
[!warning] Full Kelly is mathematically optimal but practically aggressive. The drawdowns it produces (50% or more in typical setups) exceed what most institutions can operationally sustain, because the volatility of a Kelly-sized portfolio scales with the fraction staked. Most practitioners therefore use fractional Kelly, typically 0.25x to 0.5x of full Kelly. Under the usual continuous approximation, betting a fraction c of full Kelly keeps c(2 - c) of the maximum growth rate: about 75% at half Kelly and about 44% at quarter Kelly, with volatility cut to one half and one quarter respectively. Half Kelly is the most common professional choice. The trade-off: the extra growth from full Kelly is not worth the operational risk for institutions with stakeholder reporting and risk limits.
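How much growth a fractional-Kelly bettor gives up can be checked exactly for a simple binary bet. The sketch below uses the 55% coin flip at 1:1 odds as an assumed example; `growth_retained` is a hypothetical helper name.

```python
import math


def log_growth(f: float, p: float, b: float) -> float:
    """Expected log growth per bet when staking fraction f of capital."""
    return p * math.log(1.0 + b * f) + (1.0 - p) * math.log(1.0 - f)


def growth_retained(c: float, p: float = 0.55, b: float = 1.0) -> float:
    """Share of the full-Kelly growth rate kept when betting c times full Kelly."""
    full_kelly = (b * p - (1.0 - p)) / b
    return log_growth(c * full_kelly, p, b) / log_growth(full_kelly, p, b)


for c in (0.25, 0.5, 1.0):
    print(f"{c:.2f} x Kelly keeps {growth_retained(c):.0%} of the growth rate")
```

For this coin flip, half Kelly keeps roughly three quarters of the full-Kelly growth rate and quarter Kelly a little under half, while the stake (and so the volatility) is cut to one half and one quarter.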
Application to Financial Markets
The Kelly framework extends beyond binary gambles to continuous distributions. For financial positions with normally-distributed returns, the formula becomes:
f* = μ / σ²
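Where μ is the expected excess return over the risk-free rate and σ² is the variance of returns, both measured over the same period. A result above 1 implies a leveraged position. This matches the Merton portfolio rule from continuous-time finance for an investor with logarithmic utility; Kelly and Merton arrive at the same formula from different starting points.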
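One way to see where the formula comes from, under the standard continuous-time approximation (an assumption of this sketch, not something derived in this article): a position of size f earns excess log growth of roughly g(f) = fμ - f²σ²/2. Setting the derivative μ - fσ² to zero gives f* = μ / σ². The code below checks this numerically with hypothetical inputs.

```python
def continuous_kelly(mu: float, sigma: float) -> float:
    """f* = mu / sigma^2 for expected excess return mu and volatility sigma."""
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    return mu / sigma ** 2


def excess_growth(f: float, mu: float, sigma: float) -> float:
    """Approximate excess log growth of a position of size f: f*mu - f^2*sigma^2/2."""
    return f * mu - 0.5 * (f * sigma) ** 2


# Hypothetical inputs: 5% expected excess return, 20% volatility (both annual)
mu, sigma = 0.05, 0.20
f_star = continuous_kelly(mu, sigma)

for multiple in (0.5, 1.0, 2.0):
    f = multiple * f_star
    print(f"{multiple:.1f} x f*: position {f:.2f}, "
          f"excess growth {excess_growth(f, mu, sigma):.4f}")
```

With these inputs f* is above 1, so full Kelly would call for leverage. Under the same approximation, growth at twice f* drops back to zero.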
Famous applications:
Edward Thorp at Princeton-Newport Partners. Applied Kelly principles to convertible bond arbitrage from 1969 to 1988 and achieved roughly 19% annual returns over nearly two decades with explicit Kelly-style position sizing. His methods are described in "Beat the Market" (1967) and "A Man for All Markets" (2017).
Renaissance Technologies. Jim Simons' firm reportedly uses Kelly principles in position sizing across its quantitative strategies. The Medallion fund's sustained outperformance is often attributed to superior signal generation combined with disciplined Kelly-style sizing.
Bill Gross at PIMCO. Wrote extensively about Kelly-influenced bond portfolio sizing in PIMCO's research papers.
Practical Implementation
| Step | Action | Common pitfall |
|---|---|---|
| 1. Estimate win probability | What is the win rate of this trade? | Overestimation due to backtest optimism |
| 2. Estimate payoff ratio | What is the average gain on winners relative to the average loss on losers? | Ignoring tail risk in losing scenarios |
| 3. Calculate full Kelly | f* = (bp - q) / b | Mathematical precision masks input uncertainty |
| 4. Apply fractional Kelly | Multiply by 0.25-0.5 | Choosing fraction subjectively |
| 5. Adjust for portfolio context | Combine with other positions | Treating each position independently |
| 6. Monitor and adjust | Update probability estimates with new data | Anchoring on initial estimates |
The single most common implementation error is overestimating edge. Backtests systematically overstate future returns due to overfitting and selection bias. Most realised win rates in live trading turn out to be 5-15 percentage points below backtested win rates. Sizing must account for this estimation error.
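Steps 1 to 4 of the table, together with a haircut for backtest optimism, can be sketched as a single function. Everything here is illustrative: the function name, the default 10-percentage-point haircut, and the half-Kelly multiplier are assumptions chosen for the example, not recommended values.

```python
def sized_position(
    backtest_win_rate: float,
    payoff_ratio: float,
    win_rate_haircut: float = 0.10,  # assumed allowance for backtest optimism
    kelly_multiplier: float = 0.5,   # assumed fractional-Kelly setting (half Kelly)
) -> float:
    """Fraction of capital to allocate, floored at zero when no edge remains."""
    if not 0.0 < backtest_win_rate < 1.0:
        raise ValueError("backtest_win_rate must be strictly between 0 and 1")
    if payoff_ratio <= 0.0:
        raise ValueError("payoff_ratio must be positive")

    # Step 1: shrink the backtested win rate before trusting it
    p = backtest_win_rate - win_rate_haircut
    if p <= 0.0:
        return 0.0

    # Steps 2-3: full Kelly from the adjusted win rate and the payoff ratio
    full_kelly = (payoff_ratio * p - (1.0 - p)) / payoff_ratio

    # Step 4: apply the fractional-Kelly multiplier; never size a negative edge
    return max(0.0, kelly_multiplier * full_kelly)


# Hypothetical trade: 60% backtested win rate, winners 1.5 times the size of losers
print(f"with haircut:    {sized_position(0.60, 1.5):.1%}")
print(f"without haircut: {sized_position(0.60, 1.5, win_rate_haircut=0.0):.1%}")
```

In this example the haircut halves the allocation, which shows how strongly the result depends on the win-rate estimate. Steps 5 and 6 (portfolio context and ongoing re-estimation) are outside the scope of this sketch.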
Common Misconceptions
"Kelly is optimal in all situations." Only with accurate inputs and many independent bets. With one-shot decisions, asymmetric information, or correlated outcomes, Kelly may not be optimal.
"Higher Kelly = more aggressive = better." Above full Kelly, long-run growth decreases. Over-betting is mathematically suboptimal, not aggressive-but-better.
"Kelly says I should max out my position." Only if you have very high edge and very favourable odds. For most institutional opportunities, Kelly sizing is 5-15% of capital, not 50% or 100%. The intuition that "if I'm confident I should size big" leads to consistent over-sizing relative to Kelly-optimal.
Frequently asked questions
What is the simplest example of Kelly?
A coin flip with 55% probability of winning and 1:1 payoff. Full Kelly = (1 × 0.55 - 0.45) / 1 = 0.10, or 10%. So Kelly says bet 10% of capital on each flip. Over many flips, this maximises the geometric growth rate. Betting more than 10% produces lower long-run growth despite higher short-run upside.
Why do practitioners use fractional Kelly?
Three reasons. First, probability estimates are uncertain: full Kelly assumes you know the true probability, which is rarely the case. Second, full Kelly produces very large drawdowns (50% or more is common). Third, financial markets don't have discrete outcomes like coin flips, so the formula is only an approximation and its errors add to the risk of over-betting. Half-Kelly captures most of the long-run growth at materially lower drawdown risk.
Has Kelly been applied successfully in financial markets?
Yes. Edward Thorp applied it to convertible bond arbitrage at Princeton-Newport Partners (1969-1988), achieving sustained outperformance with explicit Kelly-style sizing. Renaissance Technologies reportedly uses Kelly principles in its position sizing. Bill Gross at PIMCO used variations. The framework is foundational to professional position sizing across multiple strategies.
What is the danger of Kelly?
Over-estimation of edge. The formula is highly sensitive to the probability estimate. If you estimate a 55% win probability at 1:1 odds but the true probability is 52%, full Kelly would have you bet 10% (optimal for 55%) when you should bet 4% (optimal for 52%). Betting 2.5 times the optimal fraction pushes the expected log growth per bet below zero, so capital shrinks over many bets even though every individual bet has positive expected value.
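The effect of a misjudged edge can be checked numerically. The sketch below assumes 1:1 odds, a believed win probability of 55% and a true one of 52%; the helper names are hypothetical, and `typical_multiple` assumes wins arrive at exactly the true rate.

```python
import math


def kelly_fraction(p: float, b: float = 1.0) -> float:
    """Full-Kelly stake (b*p - q) / b."""
    return (b * p - (1.0 - p)) / b


def log_growth(f: float, p: float, b: float = 1.0) -> float:
    """Expected log growth per bet when staking f with true win probability p."""
    return p * math.log(1.0 + b * f) + (1.0 - p) * math.log(1.0 - f)


def typical_multiple(f: float, p: float, n: int) -> float:
    """Wealth multiple after n bets when wins occur at exactly rate p."""
    return math.exp(n * log_growth(f, p))


estimated_p, true_p = 0.55, 0.52
stake_used = kelly_fraction(estimated_p)   # sized on the believed edge
stake_optimal = kelly_fraction(true_p)     # what the true edge supports

g_used = log_growth(stake_used, true_p)
g_optimal = log_growth(stake_optimal, true_p)

print(f"stake used {stake_used:.1%}: growth per bet {g_used:+.5f}")
print(f"optimal stake {stake_optimal:.1%}: growth per bet {g_optimal:+.5f}")
print(f"typical multiple after 1000 bets at the stake used: "
      f"{typical_multiple(stake_used, true_p, 1000):.2f}")
```

The stake sized on the believed edge has negative growth under the true probability, while the smaller stake still grows, so the sign of the long-run result depends on a three-point error in the win rate.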