Giving The Odds On The NCAA Tournament (Round 1 Edition)
by user Davis21wylie
It's officially my favorite time of year. Every March, the sports world experiences a rebirth, as spring training signals the end of a dreary winter and the beginning of a brand new season... But beyond baseball, March is one of the best sports months for another obvious reason: the NCAA Basketball Tournament. Oh, how we love filling out our brackets, and investing ourselves emotionally in every underdog from George Mason to Weber State, in the hope that someday -- somehow -- we'll achieve our ultimate fantasy: the perfect bracket... All 64 matchups correctly picked. 'Tis a consummation devoutly to be wished!
But, alas, no one can confidently predict the NCAA tournament with 100% accuracy. As a matter of fact, if every game were a coin flip, the odds of nailing every single one would be 1 in 9,223,372,036,854,775,808 (that's 2^63, one factor of 2 for each of the 63 games from the round of 64 onward)! But the good news is that we can improve those odds with a little know-how.
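For the curious, here is a quick sketch of where that enormous number comes from. It assumes the figure counts 63 games, each treated as an independent coin flip. The function names and the 70% pick rate in the second call are made up purely for illustration, not taken from any real data.

```python
def coin_flip_brackets(n_games: int) -> int:
    """Number of distinct brackets when each game has two possible winners."""
    return 2 ** n_games


def perfect_bracket_probability(p_correct: float, n_games: int) -> float:
    """Chance of a perfect bracket if every pick is right with the same
    probability p_correct and the games are independent (a simplification)."""
    return p_correct ** n_games


# 63 games from the round of 64 through the title game (assumption).
print(f"1 in {coin_flip_brackets(63):,}")

# Hypothetical: picking each game correctly 70% of the time.
p = perfect_bracket_probability(0.70, 63)
print(f"roughly 1 in {1 / p:,.0f}")
```

Even a modest edge on each pick shrinks the odds by many orders of magnitude, which is the whole point of bringing better information to the bracket.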
A dude named Ken Pomeroy happens to possess a lot of said know-how, and he has put it to good use all season long. A fellow devotee of Dean Oliver's work (along with myself, of course), Ken has based his power rankings around the simple (yet surprisingly resisted in basketball circles) concept that per-possession efficiency is the number-one indicator of team strength. Under this paradigm, team offense and defense should not be measured in terms of points per game (because the pace at which a team plays can heavily distort per-game numbers), but rather per possession. This allows us to compare offenses and defenses outside of the influence of pace, giving us a better idea of which areas are truly a team's strength or weakness.
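To see why pace distorts per-game numbers, here is a minimal sketch. It assumes efficiency is expressed as points per 100 possessions (a common convention, but the scaling is my assumption here), and the two teams and their numbers are invented for illustration.

```python
def efficiency(points: float, possessions: float) -> float:
    """Points scored (or allowed) per 100 possessions.

    The per-100 scaling is an assumed convention; possession counts
    are taken as given rather than estimated from box-score stats.
    """
    if possessions <= 0:
        raise ValueError("possessions must be positive")
    return 100.0 * points / possessions


# Two hypothetical teams that both average 75 points per game.
slow_team = efficiency(points=75, possessions=60)  # plays at a slow pace
fast_team = efficiency(points=75, possessions=75)  # plays at a fast pace

print(slow_team, fast_team)
```

Both teams look identical on a per-game basis, but the slower team gets far more out of each trip down the floor.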
In college basketball, strength of schedule is also a major factor -- putting up a certain level of offensive efficiency in the Patriot League is clearly not the same as putting it up in the ACC. So Pomeroy adjusts his team offensive and defensive ratings to reflect this reality, accounting for the strength of opposing defenses and offenses, and scaling his numbers to reflect efficiency versus an average D1 opponent at a neutral site.
From there, one can employ Bill James' handy Pythagorean formula to establish an expected winning % for every team. The only difference between James' formula for baseball and the one Pomeroy uses for college hoops is the exponent used in the equation -- instead of 2, Pomeroy has found that an exponent of 11.5 creates winning percentages that predict future outcomes best:
Pyth% = (Off. Efficiency^11.5)/((Off. Efficiency^11.5) + (Def. Efficiency^11.5))
The results are KenPom's power rankings, the most theoretically sound means of predicting future games available for free to the public. And since the ratings are in the form of winning percentages, we can actually use them to create win probabilities for any game in the NCAA Tournament! Cool, no?
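If you'd like to play with the Pythagorean formula yourself, here is a small sketch of it. The function name and the sample efficiencies are my own placeholders (not real KenPom figures); only the formula and the 11.5 exponent come from the text above.

```python
def pythagorean_win_pct(off_eff: float, def_eff: float,
                        exponent: float = 11.5) -> float:
    """Expected winning percentage from adjusted offensive and defensive
    efficiency: Off^x / (Off^x + Def^x), with x = 11.5 by default."""
    off_term = off_eff ** exponent
    def_term = def_eff ** exponent
    return off_term / (off_term + def_term)


# Hypothetical team: scores 115 and allows 95 points per 100 possessions.
print(round(pythagorean_win_pct(115.0, 95.0), 3))

# A team that scores exactly as efficiently as it allows is a .500 team.
print(pythagorean_win_pct(100.0, 100.0))
```

Because the exponent is so much larger than baseball's 2, a small gap between offensive and defensive efficiency turns into a big gap in expected winning percentage.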
Just like I did in this article, the tool we'll be using for creating these probabilities is another Bill James invention: the Log5 method. Basically, the Log5 takes the winning percentages of each team in a matchup, and plugs them into this formula:
WPct = (A - A * B) / (A + B - 2 * A * B)
The resulting percentage is the odds that Team A will beat Team B. Since it relies only on probability theory (meaning that it is not baseball-specific), we can apply James' formula to college basketball as well, using our expected winning percentages as A & B. The results should give us a pretty decent idea of which matchups are all but certainties (Kansas vs. the play-in winner, for example), which ones are too close to call (Kentucky-Villanova), and which ones are upsets ripe for the calling (Creighton over Nevada, anyone?).
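Here is a minimal sketch of the Log5 calculation, for anyone who wants to run their own matchups. The function name and the sample winning percentages are hypothetical; plug in each team's Pyth value as A and B.

```python
def log5(a: float, b: float) -> float:
    """Probability that Team A beats Team B, given each team's expected
    winning percentage: (A - A*B) / (A + B - 2*A*B)."""
    denominator = a + b - 2 * a * b
    if denominator == 0:
        # Happens only when both teams are rated 0.0 or both 1.0,
        # where the formula is undefined.
        raise ValueError("Log5 is undefined for these winning percentages")
    return (a - a * b) / denominator


# Hypothetical matchup: a .950 team against a .600 team.
favorite = log5(0.95, 0.60)
underdog = log5(0.60, 0.95)
print(round(favorite, 3), round(underdog, 3))
```

Two sanity checks worth noting: evenly matched teams come out at exactly 50%, and the two sides of any matchup always add up to 100%.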
So, enough talk. Let's commence with the odds for Round 1 of the NCAA Tournament...
For reference's sake, "Pyth" is the team's expected winning % (based on the Pythagorean formula), "AdjO" and "AdjD" are their offensive and defensive efficiencies (adjusted for strength of competition), "Cons" is the team's consistency (the standard deviation of scoring difference per game for a team), and "Luck" is the deviation in winning percentage between a team’s actual record and their expected record using the correlated Gaussian method.
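As a rough illustration of the last two columns, here is a sketch of how Cons and Luck could be computed. Treat it loosely: I'm assuming a population standard deviation for Cons, and the Luck function simply takes the expected winning percentage as an input rather than implementing the correlated Gaussian method. The game margins and record below are invented.

```python
from statistics import pstdev


def consistency(margins: list[float]) -> float:
    """Standard deviation of per-game scoring margin (lower = steadier).
    Uses the population standard deviation, which is an assumption."""
    return pstdev(margins)


def luck(wins: int, losses: int, expected_win_pct: float) -> float:
    """Actual winning percentage minus expected winning percentage.
    The expected value is supplied by the caller; this sketch does not
    compute it with the correlated Gaussian method."""
    games = wins + losses
    if games == 0:
        raise ValueError("team has played no games")
    return wins / games - expected_win_pct


# Hypothetical team: margins from five games and a 24-6 record.
margins = [12, -3, 25, 1, -8]
print(round(consistency(margins), 2))

extra = luck(wins=24, losses=6, expected_win_pct=0.75)
print(round(extra * 30, 1), "wins above expectation")
```

Multiplying the Luck figure by games played converts it into wins above or below expectation.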
The most telling indicator of team strength is obviously Pyth, which takes all of the information available on a team (offensively, defensively, rebounding, SOS, etc.) and synthesizes it into a single "True Winning %". But while the other two categories are not explicitly factored into the odds, they can also give valuable insight into a matchup. If the odds are reasonably close (i.e., the favorite's % is <70%) and the favorite is inconsistent, it might signal upset potential. In other words, highly rated teams that are inconsistent tend to look beatable more often. The least consistent contender (1 or 2 seeds) is easily Florida, meaning that some team down the line (Maryland, perhaps?) could readily pull an upset thanks to the Gators' up-and-down nature.
Luck is another key factor to keep in mind, albeit not as important as Pyth or Cons. If a team has been lucky, it probably means that they were overseeded in the tournament. It also means that, sooner or later, their luck will turn on them -- probably in a critical Sweet 16 or Elite 8 matchup. Of all the contenders, Ohio State has been the luckiest, outperforming their expected record by almost two full wins.
A lucky season + bad consistency could mean an early exit for a team, especially in matchups where the odds are less than overwhelming. Despite their high seedings, Kansas and Wisconsin have mixed inconsistency with unusually good luck all season long, perhaps foreshadowing tournament disappointment. The most ironclad contenders by this method are North Carolina and UCLA, each of which has stayed relatively consistent in the face of bad luck. Southern Illinois also scores well by this method, meaning that they could be a dark horse in the West bracket.
Well, that's all for now. After Round 1 is completed, I'll post Round 2 odds. As always, thanks for reading!

