March Madness Metrics: Quantifying the Madness

It has been almost two full months since the 2021-22 college basketball season came to a close with the Kansas Jayhawks' come-from-behind victory over the North Carolina Tar Heels in the National Championship game. But data does not have an off-season.

So far this spring, we have examined the NCAA Tournament resumes of the top college head coaches over the past 40 years. We have found that just based on the raw numbers, Michigan State University Head Coach Tom Izzo is among the best. When it comes to performance compared to expectation, Coach Izzo is the best of all time.

We also explored and quantified the difficulty of both the draw and actual path of several notable teams over the past 20 tournaments. Interestingly, the data suggests that the 2022 Jayhawks had the easiest path to a Title of any team since 2002. Ironically, the 2021 Baylor Bears had the most difficult path in the same timeframe.

Finally, it is time to wrap up this series with a deep-dive into the overall odds of the NCAA Tournament as well as the odds of picking a perfect bracket in an office pool. As we shall see, it is possible to quantify the Madness.

Overall Tournament Odds

Throughout this series and in my annual Tournament preview, I have outlined a variety of tools that can be deployed in order to gain a deeper understanding of the way the NCAA Tournament actually works. Almost all of them hinge on the use of Kenpom efficiency data to project point spreads and victory probabilities for any arbitrary tournament match-up.
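As a rough illustration of how an efficiency margin can turn into a spread and a win probability, here is a minimal sketch. It is not necessarily the exact method used in this series. The tempo of 68 possessions, the 11-point standard deviation, and the function names `projected_spread` and `win_probability` are all assumptions chosen for illustration.

```python
from statistics import NormalDist


def projected_spread(margin_a: float, margin_b: float, tempo: float = 68.0) -> float:
    """Projected spread for team A over team B on a neutral floor.

    Efficiency margins are in points per 100 possessions, so the difference
    is scaled by the expected number of possessions (tempo is an assumption).
    """
    return (margin_a - margin_b) * tempo / 100.0


def win_probability(spread: float, sigma: float = 11.0) -> float:
    """Chance that the team favored by `spread` wins.

    Assumes the final margin is normally distributed around the spread
    with standard deviation `sigma` (an assumed value, not from the article).
    """
    return NormalDist(mu=0.0, sigma=sigma).cdf(spread)


# Hypothetical match-up: a +30.0 team against a -5.0 team.
spread = projected_spread(30.0, -5.0)
print(f"spread: {spread:.1f} points, win probability: {win_probability(spread):.3f}")
```

A pick'em game (spread of zero) comes out to exactly 50 percent in this model, and the probability climbs toward 100 percent as the spread grows.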

Using these tools, it is possible to calculate the odds for any team to win any of the NCAA Tournaments back to 2002 when Kenpom began tracking this data. When all of this data is taken together, a big picture emerges as to the chances for any team to cut down the nets. Figure 1 below summarizes this data.

Figure 1: Odds for every NCAA Tournament team to win the National Title for the 2002-2022 seasons using both a linear scale (left) and a log scale (right).

As we can see, the best pretournament odds of any team in the last 20 years are just slightly better than 35 percent, which were the odds that Gonzaga had prior to the 2021 tournament. Other notable teams whose odds were greater than 25 percent include the 2002 Duke team, the 2015 Kentucky team, the 2008 Kansas team, and the 2019 Virginia team. 

Note that the differences in odds shown in Figure 1 for teams with similar pretournament Kenpom odds are due entirely to differences in the tournament draws for each team. This topic was covered in detail in the previous installment of this series.

Of the seven total teams whose odds were greater than 25 percent entering the tournament, only two of those teams (Kansas in 2008 and Virginia in 2019) actually won the National Title, which is dead on the expected value of 2.13, based on the calculated odds. In other words, the #math checks out.

The bottom line is that winning the NCAA Tournament is hard. Even teams that finish the season with a Kenpom efficiency margin of +30.0 or greater average just one-in-five odds of cutting down the nets. A historical average No. 1 seed has odds of only 14 percent.

Lower seeded teams have much worse odds. The right panel of Figure 1 shows the same data, but plotted on a log scale. Interestingly, the Championship odds for the best and worst teams of the last 20 years span 14 orders of magnitude.

For those scoring at home, the team with the estimated worst odds in the past 20 years was the 2005 No. 16 seed Alabama A&M team, which lost to No. 16 Oakland in the play-in game. My math gave the Bulldogs a one-in-97 trillion chance to win the National Title.

Perfect Bracket Odds

Over the years, many people have dreamed of winning their NCAA Tournament "office" bracket by somehow picking the results of all 63 games correctly (not counting the play-in round). Naturally, this has led many people to attempt to calculate those odds. The internet has a lot of articles that attempt this calculation. Most of them are wrong.

The most trivial way to make this calculation is to assume that all 63 games are coin flips and each team has a 50 percent chance to win each game. If this were the case, the odds of picking the perfect bracket would be about one-in-9.2 quintillion, which is the number that is cited most frequently. But it is actually a form of upper bound on the real odds.
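The coin-flip figure is simply two raised to the 63rd power. A quick check (the variable names are just for illustration):

```python
GAMES = 63  # main bracket only; the play-in round is not counted

# With every game a 50/50 toss-up, each of the 2**63 possible brackets is equally likely.
coin_flip_brackets = 2 ** GAMES
print(f"one-in-{coin_flip_brackets:,}")  # one-in-9,223,372,036,854,775,808
```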

The reason is that not all games are toss-ups. No. 1 Kansas did not have a 50 percent chance to beat No. 16 Texas Southern this spring. Kansas's odds were closer to 97 percent. In other words, the coin that we use to make the calculation in the previous example is loaded. It would only be a "fair" coin in the extreme case.

As it turns out, the true odds to pick a perfect bracket are based on a certain weighted average (technically the geometric mean) of the odds for the favored team to win each tournament game. This weighted average is a function of the specific strengths of each team in any given tournament, which means that the odds of picking all of the games correctly vary from year-to-year. 
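To make the "weighted coin" idea concrete, here is a small sketch. It assumes the games are independent, and the three probabilities are made-up values for illustration, not real tournament data.

```python
import math


def perfect_bracket_probability(pick_probs):
    """Probability that every pick is correct, assuming independent games."""
    return math.prod(pick_probs)


def geometric_mean(pick_probs):
    """The single 'coin weight' that gives the same overall probability."""
    return math.exp(sum(math.log(p) for p in pick_probs) / len(pick_probs))


# Hypothetical three-game slate (illustrative numbers only).
picks = [0.97, 0.70, 0.55]
p_all = perfect_bracket_probability(picks)
coin = geometric_mean(picks)

# Flipping the equivalent coin once per game reproduces the same probability.
print(f"all three correct: {p_all:.4f}")
print(f"equivalent coin:   {coin:.4f} -> {coin ** len(picks):.4f}")
```

The same relationship holds for a full bracket: the geometric mean raised to the 63rd power is the probability of getting all 63 games right.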

The actual weighted average is around 58 percent (and not 50 percent) based on data for the past 20 tournaments. This value tells us that the real odds to pick a perfect bracket are closer to one-in-540 trillion. That is still a really large number, but it is 17,000 times more likely than the value that most people reference.

It is also possible to calculate the lower bound for the odds to pick a perfect bracket. These odds occur in the scenario where the favored teams win all 63 games in the Tournament. Effectively, the Tournament would proceed according to "chalk."

In this scenario, the weighted average of the hypothetical coin is closer to 68 percent, on average. Based on this value, the lower bound for the odds to select the perfect bracket has averaged about one-in-49 billion over the past 20 tournaments. The real odds are about 11,000 times less likely than this lower bound.
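The quoted odds and coin weights can be tied together by taking the 63rd root of the odds. The sketch below works backward from the rounded figures in the text, so the results are only approximate; `implied_coin_weight` is a helper name made up for this example.

```python
GAMES = 63


def implied_coin_weight(one_in: float, games: int = GAMES) -> float:
    """Per-game 'coin weight' implied by overall odds of one-in-`one_in`."""
    return (1.0 / one_in) ** (1.0 / games)


coin_flip = 2.0 ** GAMES  # about 9.2 quintillion
typical = 540e12          # one-in-540 trillion (typical bracket)
chalk = 49e9              # one-in-49 billion (all favorites win)

print(f"typical coin weight: {implied_coin_weight(typical):.3f}")  # about 0.58
print(f"chalk coin weight:   {implied_coin_weight(chalk):.3f}")    # about 0.68
print(f"coin flip vs. typical: {coin_flip / typical:,.0f}x")       # about 17,000
print(f"typical vs. chalk:     {typical / chalk:,.0f}x")           # about 11,000
```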

Perfect Bracket Odds Over the Years

Now that it is clear that the odds of a perfect bracket have clear bounds and differ year to year, it is time to visualize what these odds have looked like over the years. Figure 2 provides this summary.

Figure 2: Actual odds of a perfect bracket compared to the "chalk" bracket where the favorite teams win each contest and the average odds resulting from a series of Monte Carlo simulations of each tournament.

As we can see from the orange bars, the "coin" weighted average is between 65 and 70 percent for the most likely, "chalk" brackets. This translates to odds between approximately one-in-one billion and one-in-one trillion. The best possible odds of a perfect bracket would have been in 2015 using a strategy of picking all of the Kenpom favorites to win all 63 games. In that scenario, the odds of being correct were one-in-4.3 billion.

When each tournament was then simulated, the odds dropped significantly, as shown by the striped green bars. Over the past 20 tournaments the geometric average of the odds for a perfect simulated bracket ranged from a high of one-in-60 trillion in 2015 to a low of one-in-10 quadrillion in 2006. Note that the "chalk" data and the simulated data are highly correlated.
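The comparison between the "chalk" bracket and the simulated brackets can be sketched on a toy scale. The example below is an illustration under stated assumptions, not the model behind Figure 2: the logistic `win_prob` function, its `scale` parameter, and the eight team ratings are all hypothetical.

```python
import math
import random


def win_prob(rating_a, rating_b, scale=10.0):
    """Assumed logistic model for the chance that team A beats team B."""
    return 1.0 / (1.0 + math.exp(-(rating_a - rating_b) / scale))


def play_bracket(ratings, first_team_wins):
    """Play a single-elimination bracket and return the log-probability
    of the exact set of results that occurred.

    `ratings` is ordered so that adjacent entries meet in the first round
    (length must be a power of two). `first_team_wins(p)` decides each game.
    """
    field = list(ratings)
    log_prob = 0.0
    while len(field) > 1:
        survivors = []
        for a, b in zip(field[0::2], field[1::2]):
            p = win_prob(a, b)
            if first_team_wins(p):
                survivors.append(a)
                log_prob += math.log(p)
            else:
                survivors.append(b)
                log_prob += math.log(1.0 - p)
        field = survivors
    return log_prob


def chalk_log_prob(ratings):
    """Every game goes to the favorite."""
    return play_bracket(ratings, lambda p: p >= 0.5)


def simulated_log_prob(ratings, trials=10_000, seed=0):
    """Average log-probability over Monte Carlo runs of the bracket.

    Exponentiating this average gives the geometric mean of the odds.
    """
    rng = random.Random(seed)
    total = sum(play_bracket(ratings, lambda p: rng.random() < p) for _ in range(trials))
    return total / trials


# Hypothetical 8-team field (ratings are illustrative only).
ratings = [30.0, 2.0, 18.0, 14.0, 24.0, 8.0, 20.0, 12.0]
chalk = math.exp(chalk_log_prob(ratings))
simulated = math.exp(simulated_log_prob(ratings))
print(f"chalk bracket:              one-in-{1 / chalk:,.1f}")
print(f"simulated (geometric mean): one-in-{1 / simulated:,.1f}")
```

Even in this tiny field, the typical simulated bracket is noticeably less likely than the chalk bracket, which is the same pattern the orange and striped green bars show at full scale.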

It is interesting to note that the odds of selecting a perfect bracket were better in years such as 2015, 2019, and 2021. The odds were worse in years such as 2003 and 2006. In the previous piece in this series, I pointed out that the former set of years were ones where the bracket was particularly strong and the latter years were ones where the bracket was particularly weak.

As a general rule, a stronger bracket should result in fewer upsets and it will therefore be more predictable. While there is a correlation between the simulated odds and the actual odds of a perfect bracket, that correlation is quite weak. As the dotted green bars show, the actual odds of correctly picking the results of all 63 games have varied between one-in-3.2 trillion (in 2019) and one-in-350 quadrillion (in 2022).

A comparison of the simulation odds and the actual odds essentially provides a way to quantify the Madness of March. In the years when the actual odds are higher than the average of the simulations (such as 2007, 2008, and 2019) the tournament tended to have fewer upsets total and a larger number of higher seeds advance to the Final Four. For example, 2008 is the only year in history where all four No. 1 seeds advanced to the Final Four.

The opposite is true for the years where the actual odds are significantly worse than the simulated odds. In those years there was an above average amount of Madness due to a large number of upsets, the occurrence of major upsets (such as a No. 15 seed beating a No. 2 seed) or both. These years also tend to result in lower seeds advancing to the Final Four. 

To highlight a few examples, in 2011 a No. 8 seed (Butler) and a No. 11 seed (VCU) made the Final Four. In 2018, No. 1 seed Virginia lost to No. 16 seed UMBC and a No. 11 seed (Loyola Chicago) made the Final Four. In 2021, No. 2 seed Ohio State lost in the first round and No. 11 seed UCLA made the Final Four. In 2022, No. 15 seed Saint Peter's made the Regional Final and No. 8 seed North Carolina reached the Final game.

When it comes to unlikely events in the NCAA Tournament, No. 1 seed Virginia's loss in the first round to No. 16 seed UMBC is usually the event cited as being the most "Mad." However, the statistics (based on the Vegas spread) suggest that this type of upset should occur in about one percent of all No. 1 versus No. 16 games. With four such games each year, we should expect to see a No. 1 seed go down about once every 25 years.

However, a No. 15 seed advancing to the Regional Finals (as Saint Peter's did this year) had odds of roughly 0.18 percent or one-in-550. This suggests that this type of event should only happen once every 140 tournaments. The math suggests that no one alive will likely ever witness such an unlikely Tournament run again in their lifetime.
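Both "once every N years" figures follow from the same arithmetic: multiply the per-team odds by the four chances each tournament provides (one per region), then invert. A minimal sketch, where `tournaments_between` is a helper name made up for this example:

```python
def tournaments_between(prob_per_chance: float, chances_per_tournament: int = 4) -> float:
    """Average number of tournaments between occurrences of an event."""
    expected_per_tournament = prob_per_chance * chances_per_tournament
    return 1.0 / expected_per_tournament


# A No. 16 seed beating a No. 1 seed: about one percent per game, four games per year.
print(f"16-over-1 upset: about every {tournaments_between(0.01):.0f} tournaments")

# A No. 15 seed reaching the Regional Final: about one-in-550 per team, four teams per year.
print(f"15 seed in the Regional Final: about every {tournaments_between(1 / 550):.0f} tournaments")
```

The second figure works out to roughly 138, which rounds to the 140 tournaments quoted above.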

With that, it is time to finally put a bow on the college basketball season. Until next time, enjoy, and Go Green.
