Baseball Sharpe Ratio
by user Marcvsxiv
As a preface, I am hardly even a recreational sabermetrician, so the idea below may be neither novel nor reasonable.
The Yankees' early exit from the playoffs once again demonstrated the disconnect between regular season performance and playoff success. This got me thinking about the math underlying this disconnect.
In predicting success in the playoffs based on regular season performance, the naive strategy would be something like: Compute a team's historical run differential (RD) -- the number of runs scored minus the number of runs against. The team with the higher run differential is likely to win. Based on my quick back-of-Excel calculations (using stats from http://www.retrosheet.org/), this method would have predicted the winner in 8 of the 14 playoff series in 2004 and 2005, which is not much better than a coin flip (the sample size is obviously too small for any real conclusion).
The question is: can this be improved?
The most obvious difference, I think, between the regular season and the playoffs is, simply, sample size. Over the course of a 162-game season, numbers average out. In a best-of-5 or best-of-7 series, well, anything can happen.
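To put a rough number on "anything can happen", here is a minimal sketch of how often the better team wins a short series. It assumes something the post does not claim: that every game is independent and that the better team wins each one with the same fixed probability `p`. The function name and the 55% figure are purely illustrative.

```python
from math import comb

def series_win_prob(p: float, best_of: int) -> float:
    """Chance of winning a best-of-`best_of` series with per-game win probability p.

    Assumes independent games with a constant p (a simplification).
    """
    need = best_of // 2 + 1  # wins required to clinch
    q = 1.0 - p
    # The series ends on the clinching win; before that the loser has k wins, k < need.
    return sum(comb(need - 1 + k, k) * p**need * q**k for k in range(need))

# A team that wins 55% of its games is only a modest favorite in a short series.
for n in (5, 7):
    print(n, round(series_win_prob(0.55, n), 3))
```

Under those assumptions, a 55% team wins a best-of-7 only about 61% of the time, which is in line with the intuition that a short series leaves a lot of room for noise.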
What this argues for, then, is that volatility, which gets averaged out during the long regular season, factors in much more strongly in the playoffs. To determine a team's likelihood of playoff success, what is needed is a measure which rewards consistency and punishes volatility.
In finance, there is one such, rather popular, measure: The Sharpe Ratio. When evaluating an asset, its Sharpe is the ratio of the expected (mean) excess return to the risk/volatility (measured in standard deviation) of those returns. The Sharpe Ratio, thus, rewards higher expected returns and lower volatility.
Runs, of course, are the baseball parallel to returns. A baseball team's Sharpe Ratio would be its average number of runs divided by the standard deviation of those runs. One complication when applying this to baseball is that there are two dimensions to account for -- Runs Scored and Runs Against. A team would therefore have two Sharpe Ratios, a Runs Scored Sharpe (RSS) and a Runs Against Sharpe (RAS).
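Written out in symbols, the parallel looks roughly like this. The notation is my own shorthand rather than anything standard in baseball: $r$ is an asset's return, $r_f$ a benchmark (risk-free) return, and $\mu$ and $\sigma$ are the mean and standard deviation of the subscripted quantity.

$$
\text{Sharpe} = \frac{\mathrm{E}[r - r_f]}{\sigma_{r - r_f}}, \qquad
\text{RSS} = \frac{\mu_{RS}}{\sigma_{RS}}, \qquad
\text{RAS} = \frac{\mu_{RA}}{\sigma_{RA}}
$$

Note that the baseball versions, as described here, divide the raw mean by the standard deviation; there is no benchmark subtracted the way $r_f$ is in the finance version.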
A naive stab at using this measure would be to compute a Sharpe Run Differential (SRD) -- the RSS minus the RAS. One straightforward wrinkle is that since having different starters on a day-to-day basis might exaggerate the pitching volatility, it makes sense to lump a small number of games together. In my back-of-Excel calculations, I chose groups of three, mainly because 162 is evenly divisible by three and six seemed too large. In other words, instead of 162 observations, the mean and standard deviation were calculated on 54 (= 162 / 3) observations, each one being the total runs over three consecutive games.
What turned up seems reasonably promising. Assuming I did all my calculations right, the SRD method would have predicted the winner in 11 of the 14 series (it would have missed NYY/MIN '04, HOU/ATL '04 and HOU/STL '05). I've included the numbers below for perusal/validation. R and RA, along with their standard deviations, are per three-game group; RD is per game:
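For anyone who would rather not do this in a spreadsheet, the calculation can be sketched in a few lines of Python. Two assumptions on my part: games are grouped in schedule order, and the standard deviation is the sample version (which is what a spreadsheet STDEV function normally gives). The function names and the nine-game toy inputs are made up for illustration; they are not real team data.

```python
from statistics import mean, stdev

def group_sums(per_game, size=3):
    """Sum consecutive blocks of `size` games; a trailing partial block is dropped."""
    n = len(per_game) - len(per_game) % size
    return [sum(per_game[i:i + size]) for i in range(0, n, size)]

def sharpe(values):
    """Mean divided by the sample standard deviation (assumed convention)."""
    return mean(values) / stdev(values)

def sharpe_run_differential(runs_scored, runs_against, size=3):
    """Return (RSS, RAS, SRD) computed on grouped run totals."""
    rss = sharpe(group_sums(runs_scored, size))
    ras = sharpe(group_sums(runs_against, size))
    return rss, ras, rss - ras

# Toy per-game runs for nine games (made-up numbers, not real data).
toy_scored = [3, 5, 4, 6, 2, 7, 1, 8, 3]
toy_against = [4, 4, 4, 2, 3, 1, 5, 5, 5]
print(sharpe_run_differential(toy_scored, toy_against))
```

With a full season, each input list would hold 162 per-game values and `group_sums` would return the 54 observations described above.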
| Year | Team | League | R | STDEV (R) | RSS | RA | STDEV (RA) | RAS | RD | SRD |
|---|---|---|---|---|---|---|---|---|---|---|
| 2005 | ANA | AL | 14.093 | 5.564 | 2.533 | 11.907 | 5.278 | 2.256 | 0.728 | 0.277 |
| 2005 | ATL | NL | 14.241 | 5.099 | 2.793 | 12.481 | 4.737 | 2.635 | 0.586 | 0.158 |
| 2005 | BOS | AL | 16.852 | 5.427 | 3.105 | 14.907 | 5.540 | 2.691 | 0.648 | 0.414 |
| 2005 | CHA | AL | 13.722 | 4.752 | 2.888 | 11.944 | 5.293 | 2.257 | 0.593 | 0.631 |
| 2005 | HOU | NL | 12.722 | 5.145 | 2.473 | 11.204 | 5.339 | 2.098 | 0.506 | 0.374 |
| 2005 | NYA | AL | 16.407 | 5.995 | 2.737 | 14.611 | 5.757 | 2.538 | 0.599 | 0.199 |
| 2005 | SDN | NL | 12.667 | 6.171 | 2.053 | 13.444 | 5.087 | 2.643 | -0.259 | -0.590 |
| 2005 | SLN | NL | 14.907 | 4.763 | 3.130 | 11.741 | 5.126 | 2.291 | 1.056 | 0.839 |
| 2004 | ANA | AL | 15.481 | 6.291 | 2.461 | 13.593 | 5.903 | 2.303 | 0.630 | 0.158 |
| 2004 | ATL | NL | 14.870 | 5.103 | 2.914 | 12.370 | 5.307 | 2.331 | 0.833 | 0.583 |
| 2004 | BOS | AL | 17.574 | 5.493 | 3.199 | 14.222 | 5.765 | 2.467 | 1.117 | 0.732 |
| 2004 | HOU | NL | 14.870 | 6.168 | 2.411 | 12.926 | 4.971 | 2.600 | 0.648 | -0.189 |
| 2004 | LAN | NL | 14.093 | 4.779 | 2.949 | 12.667 | 4.762 | 2.660 | 0.475 | 0.289 |
| 2004 | MIN | AL | 14.444 | 5.541 | 2.607 | 13.241 | 5.248 | 2.523 | 0.401 | 0.084 |
| 2004 | NYA | AL | 16.611 | 6.806 | 2.441 | 14.963 | 5.821 | 2.570 | 0.549 | -0.130 |
| 2004 | SLN | NL | 15.833 | 5.372 | 2.947 | 12.204 | 5.078 | 2.403 | 1.210 | 0.544 |
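For validation, the two hit counts can be re-derived from the table with a short script. The RD and SRD values are copied from the table above; the list of series winners and losers is not in the table and was typed in by me from the 2004 and 2005 postseason results, so it is worth double-checking. The names `STATS`, `SERIES`, and `hits` are just for this sketch.

```python
# (RD, SRD) per (year, team), copied from the table above.
STATS = {
    (2005, "ANA"): (0.728, 0.277), (2005, "ATL"): (0.586, 0.158),
    (2005, "BOS"): (0.648, 0.414), (2005, "CHA"): (0.593, 0.631),
    (2005, "HOU"): (0.506, 0.374), (2005, "NYA"): (0.599, 0.199),
    (2005, "SDN"): (-0.259, -0.590), (2005, "SLN"): (1.056, 0.839),
    (2004, "ANA"): (0.630, 0.158), (2004, "ATL"): (0.833, 0.583),
    (2004, "BOS"): (1.117, 0.732), (2004, "HOU"): (0.648, -0.189),
    (2004, "LAN"): (0.475, 0.289), (2004, "MIN"): (0.401, 0.084),
    (2004, "NYA"): (0.549, -0.130), (2004, "SLN"): (1.210, 0.544),
}

# (year, winner, loser) for each series; entered by hand, not from the table.
SERIES = [
    (2004, "NYA", "MIN"), (2004, "BOS", "ANA"), (2004, "BOS", "NYA"),
    (2004, "SLN", "LAN"), (2004, "HOU", "ATL"), (2004, "SLN", "HOU"),
    (2004, "BOS", "SLN"),
    (2005, "CHA", "BOS"), (2005, "ANA", "NYA"), (2005, "CHA", "ANA"),
    (2005, "SLN", "SDN"), (2005, "HOU", "ATL"), (2005, "HOU", "SLN"),
    (2005, "CHA", "HOU"),
]

def hits(metric_index):
    """Count series in which the winner had the higher value of the chosen metric."""
    return sum(
        STATS[(year, winner)][metric_index] > STATS[(year, loser)][metric_index]
        for year, winner, loser in SERIES
    )

print("RD hits:", hits(0), "of", len(SERIES))
print("SRD hits:", hits(1), "of", len(SERIES))
```

With these inputs the script gives 8 of 14 for RD and 11 of 14 for SRD, matching the counts quoted earlier.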
Two final notes:
First, there are obvious gaps in this method, the most evident being that high volatility is good for large underdogs.
Secondly, an alternative to the SRD would be a Run Differential Sharpe (RDS) -- the mean run differential divided by the standard deviation of the run differentials. I'll try running that tomorrow (along with more years of playoffs).
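A rough sketch of what the RDS calculation might look like follows. I am assuming the same three-game grouping and sample standard deviation as for the SRD, which the definition above does not actually specify; the function name and default group size are placeholders.

```python
from statistics import mean, stdev

def run_differential_sharpe(runs_scored, runs_against, size=3):
    """Mean grouped run differential divided by its sample standard deviation.

    Grouping by `size` games is an assumption carried over from the SRD.
    """
    diffs = [rs - ra for rs, ra in zip(runs_scored, runs_against)]
    n = len(diffs) - len(diffs) % size  # drop a trailing partial group
    grouped = [sum(diffs[i:i + size]) for i in range(0, n, size)]
    return mean(grouped) / stdev(grouped)
```

Unlike the SRD, this treats scoring and prevention as a single series, so a team whose good offensive games tend to coincide with bad pitching games would look steadier here than it does when the two sides are measured separately.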
