March Madness Metrics: Quantifying the Madness

It has been almost two full months since the 2021-22 college basketball season came to a close with the Kansas Jayhawks' come-from-behind victory over the North Carolina Tar Heels in the National Championship game. But data does not have an off-season.

So far this spring, we have examined the NCAA Tournament resumes of the top college head coaches over the past 40 years. We have found that just based on the raw numbers, Michigan State University Head Coach Tom Izzo is among the best. When it comes to performance compared to expectation, Coach Izzo is the best of all time.

We also explored and quantified the difficulty of both the draw and actual path of several notable teams over the past 20 tournaments. Interestingly, the data suggests that the 2022 Jayhawks had the easiest path to a Title of any team since 2002. Ironically, the 2021 Baylor Bears had the most difficult path in the same timeframe.

Finally, it is time to wrap up this series with a deep-dive into the overall odds of the NCAA Tournament as well as the odds of picking a perfect bracket in an office pool. As we shall see, it is possible to quantify the Madness.

Overall Tournament Odds

Throughout this series and in my annual Tournament preview, I have outlined a variety of tools that can be deployed in order to gain a deeper understanding of the way the NCAA Tournament actually works. Almost all of them hinge on the use of Kenpom efficiency data to project point spreads and victory probabilities for any arbitrary tournament match-up.
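For readers who want to see the general idea in code, here is a minimal sketch of how an efficiency margin can be turned into a point spread and a win probability. This is not necessarily the exact method used in this series. The function names, the tempo of 68 possessions per game, and the 11-point standard deviation are all assumptions chosen for illustration, as is the idea that the final margin is normally distributed around the spread.

```python
import math

def projected_spread(margin_a, margin_b, tempo=68.0):
    """Projected point margin for team A over team B.

    margin_a, margin_b: efficiency margins in points per 100 possessions.
    tempo: assumed possessions per game (hypothetical default).
    """
    return (margin_a - margin_b) * tempo / 100.0

def win_probability(spread, sigma=11.0):
    """Chance that the team favored by `spread` points wins, assuming the
    final margin is normally distributed around the spread (an assumption)."""
    return 0.5 * (1.0 + math.erf(spread / (sigma * math.sqrt(2.0))))

# Example: a +30.0 team against a perfectly average (0.0) team.
spread = projected_spread(30.0, 0.0)
print(f"spread: {spread:.1f} points, win probability: {win_probability(spread):.3f}")
```

Under these assumptions, an evenly matched game comes out as a 50 percent toss-up, and the probability climbs toward 100 percent as the spread grows.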

Using these tools, it is possible to calculate the odds for any team to win any of the NCAA Tournaments back to 2002 when Kenpom began tracking this data. When all of this data is taken together, a big picture emerges as to the chances for any team to cut down the nets. Figure 1 below summarizes this data.

Figure 1: Odds for every NCAA Tournament team to win the National Title for the 2002-2022 seasons using both a linear scale (left) and a log scale (right).

As we can see, the best pretournament odds of any team in the last 20 years are just slightly better than 35 percent, which were the odds that Gonzaga had prior to the 2021 tournament. Other notable teams whose odds were greater than 25 percent include the 2002 Duke team, the 2015 Kentucky team, the 2008 Kansas team, and the 2019 Virginia team. 

Note that the differences in odds shown in Figure 1 for teams with similar pretournament Kenpom efficiency margins are due entirely to differences in the tournament draws for each team. This topic was covered in detail in the previous installment of this series.

Of the seven total teams whose odds were greater than 25 percent entering the tournament, only two (Kansas in 2008 and Virginia in 2019) actually won the National Title, which is right in line with the expected value of 2.13 based on the calculated odds. In other words, the #math checks out.

The bottom line is that winning the NCAA Tournament is hard. Even teams that finish the season with a Kenpom efficiency margin of +30.0 or greater average just one-in-five odds of cutting down the nets. A historical average No. 1 seed has odds of only 14 percent.

Lower seeded teams have much worse odds. The right panel of Figure 1 shows the same data on a log scale. Interestingly, the Championship odds for the best and worst teams of the last 20 years span 14 orders of magnitude.

For those scoring at home, the team with the estimated worst odds in the past 20 years was the 2005 No. 16 seed Alabama A&M team, which lost to No. 16 Oakland in the play-in game. My math gave the Bulldogs a one-in-97 trillion chance to win the National Title.

Perfect Bracket Odds

Over the years, many people have dreamed of winning their NCAA Tournament "office" bracket by somehow picking the results of all 63 games correctly (not counting the play-in round). Naturally, this has led many people to attempt to calculate those odds. The internet has a lot of articles that attempt this calculation. Most of them are wrong.

The most trivial way to make this calculation is to assume that all 63 games are coin flips and each team has a 50 percent chance to win each game. If this were the case, the odds of picking the perfect bracket would be about one-in-9.2 quintillion, which is the number that is cited most frequently. But it is actually a form of upper bound on the real odds.
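The coin-flip number is easy to reproduce. With 63 games and two possible outcomes per game, there are 2 to the 63rd power possible brackets. The short snippet below simply does that arithmetic (the variable names are my own).

```python
GAMES = 63  # games in the main bracket, not counting the play-in round

# Two possible winners per game, so 2**63 equally likely brackets
# if every game were a true coin flip.
coin_flip_brackets = 2 ** GAMES
print(f"one-in-{coin_flip_brackets:,}")  # one-in-9,223,372,036,854,775,808
```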

The reason is that not all games are toss-ups. No. 1 Kansas did not have a 50 percent chance to beat No. 16 Texas Southern this spring. Kansas's odds were closer to 97 percent. In other words, the coin that we use to make the calculation in the previous example is loaded. It would only be a "fair" coin in the extreme case.

As it turns out, the true odds to pick a perfect bracket are based on a certain weighted average (technically the geometric mean) of the odds for the favored team to win each tournament game. This weighted average is a function of the specific strengths of each team in any given tournament, which means that the odds of picking all of the games correctly vary from year-to-year. 

The actual weighted average is around 58 percent (and not 50 percent) based on data for the past 20 tournaments. This value tells us that the real odds to pick a perfect bracket are closer to one-in-540 trillion. That is still a really large number, but it makes a perfect bracket about 17,000 times more likely than the value that most people reference.
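To make the "loaded coin" idea concrete, here is a minimal sketch of the calculation. The chance of a perfect bracket is the product of the per-game probabilities that each pick is correct, and the geometric mean of those probabilities is the equivalent single coin. The function names are hypothetical, and the value of 0.583 is simply an illustrative input near the "around 58 percent" figure quoted above.

```python
import math

def perfect_bracket_odds(pick_probs):
    """Return (geometric_mean, one_in_n) for a list of per-game
    probabilities that each pick is correct."""
    log_sum = sum(math.log(p) for p in pick_probs)
    geometric_mean = math.exp(log_sum / len(pick_probs))
    one_in_n = math.exp(-log_sum)  # reciprocal of the product of all probabilities
    return geometric_mean, one_in_n

def odds_from_average(avg_prob, games=63):
    """One-in-N odds if every game had the same probability avg_prob."""
    return 1.0 / avg_prob ** games

print(f"fair coin:   one-in-{odds_from_average(0.5):.3g}")
print(f"loaded coin: one-in-{odds_from_average(0.583):.3g}")
```

Because the exponent is 63, small changes in the coin weight move the final odds a lot, which is why rounding the average to 58 percent changes the result noticeably.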

It is also possible to calculate the lower bound for the odds to pick a perfect bracket. These odds occur in the scenario where the favored teams win all 63 games in the Tournament. Effectively, the Tournament would proceed according to "chalk."

In this scenario, the weighted average of the hypothetical coin averages closer to 68 percent. Based on this value, the lower bound for the odds to select the perfect bracket has averaged about one-in-49 billion over the past 20 tournaments. The real odds are about 11,000 times less likely than this lower bound.
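As a quick check on the arithmetic, the quoted odds can be converted back into the implied per-game "coin" weight by taking the 63rd root. The sketch below is only a sanity check. The function name is my own, and the inputs are the rounded figures quoted in the text.

```python
GAMES = 63

def implied_coin_weight(one_in_n, games=GAMES):
    """Per-game probability p such that p ** games equals 1 / one_in_n."""
    return (1.0 / one_in_n) ** (1.0 / games)

coin_flip = 2.0 ** GAMES  # about 9.2 quintillion
typical = 540e12          # one-in-540 trillion (rounded figure from the text)
chalk = 49e9              # one-in-49 billion (rounded figure from the text)

for label, odds in [("coin flip", coin_flip), ("typical", typical), ("chalk", chalk)]:
    print(f"{label}: one-in-{odds:.3g}, coin weight {implied_coin_weight(odds):.3f}")

print(f"coin flip vs. typical: {coin_flip / typical:,.0f}x")
print(f"typical vs. chalk:     {typical / chalk:,.0f}x")
```

The implied coin weights land near 58 and 68 percent, and the two ratios come out close to the 17,000 and 11,000 multiples mentioned in the text.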

Perfect Bracket Odds Over the Years

Now that it is clear that the odds of a perfect bracket have clear bounds and differ year to year, it is time to visualize what these odds have looked like over the years. Figure 2 provides this summary.

Figure 2: Actual odds of a perfect bracket compared to the "chalk" bracket where the favorite teams win each contest and the average odds resulting from a series of Monte Carlo simulations of each tournament.

As we can see from the orange bars, the "coin" weighted average is between 65 and 70 percent for the most likely, "chalk" brackets. This translates to odds between approximately one-in-one billion and one-in-one trillion. The best possible odds of a perfect bracket would have been in 2015 using a strategy of picking all of the Kenpom favorites to win all 63 games. In that scenario, the odds of being correct were one-in-4.3 billion.

When each tournament was then simulated, the odds dropped significantly, as shown by the striped green bars. Over the past 20 tournaments the geometric average of the odds for a perfect simulated bracket ranged from a high of one-in-60 trillion in 2015 to a low of one-in-10 quadrillion in 2006. Note that the "chalk" data and the simulated data are highly correlated.
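For the curious, here is a rough sketch of how a comparison between "chalk" and simulated brackets can be set up. It is not the actual code behind Figure 2. The team ratings are randomly generated stand-ins rather than real Kenpom data, and the normal-margin win probability model (with an 11-point standard deviation) and all function names are assumptions for illustration.

```python
import math
import random

def win_prob(rating_a, rating_b, sigma=11.0):
    """Chance that team A beats team B under an assumed normal margin model."""
    z = (rating_a - rating_b) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))

def bracket_log_prob(ratings, rng=None):
    """Play one single-elimination bracket and return the log-probability
    of the full set of results.

    ratings: team ratings in bracket order (length must be a power of two).
    rng: a random.Random for a simulated bracket, or None for the "chalk"
         bracket in which the favorite wins every game.
    """
    field = list(ratings)
    log_prob = 0.0
    while len(field) > 1:
        next_round = []
        for a, b in zip(field[0::2], field[1::2]):
            p_a = win_prob(a, b)
            a_wins = (p_a >= 0.5) if rng is None else (rng.random() < p_a)
            log_prob += math.log(p_a if a_wins else 1.0 - p_a)
            next_round.append(a if a_wins else b)
        field = next_round
    return log_prob

def simulated_geometric_odds(ratings, n_sims=2000, seed=0):
    """Geometric average of the one-in-N odds over many simulated brackets."""
    rng = random.Random(seed)
    mean_log = sum(bracket_log_prob(ratings, rng) for _ in range(n_sims)) / n_sims
    return math.exp(-mean_log)

# Hypothetical 64-team field: these ratings are made up for illustration only.
field_rng = random.Random(1)
field = [field_rng.gauss(12.0, 8.0) for _ in range(64)]

chalk_odds = math.exp(-bracket_log_prob(field))
sim_odds = simulated_geometric_odds(field)
print(f"chalk: one-in-{chalk_odds:.3g}, simulated: one-in-{sim_odds:.3g}")
```

Averaging the log-probabilities, rather than the probabilities themselves, is what produces a geometric average across the simulated brackets.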

It is interesting to note that the odds of selecting a perfect bracket were better in years such as 2015, 2019, and 2021. The odds were worse in years such as 2003 and 2006. In the previous piece in this series, I pointed out that the former set of years were ones where the bracket was particularly strong and the latter years were ones where the bracket was particularly weak.

As a general rule, a stronger bracket should result in fewer upsets and it will therefore be more predictable. While there is a correlation between the simulated odds and the actual odds of a perfect bracket, that correlation is quite weak. As the dotted green bars show, the actual odds of correctly picking the results of all 63 games have varied between one-in-3.2 trillion (in 2019) and one-in-350 quadrillion (in 2022).

A comparison of the simulation odds and the actual odds essentially provides a way to quantify the Madness of March. In the years when the actual odds are higher than the average of the simulations (such as 2007, 2008, and 2019) the tournament tended to have fewer upsets total and a larger number of higher seeds advance to the Final Four. For example, 2008 is the only year in history where all four No. 1 seeds advanced to the Final Four.

The opposite is true for the years where the actual odds are significantly worse than the simulated odds. In those years there was an above average amount of Madness due to a large number of upsets, the occurrence of major upsets (such as a No. 15 seed beating a No. 2 seed) or both. These years also tend to result in lower seeds advancing to the Final Four. 

To highlight a few examples, in 2011 a No. 8 seed (Butler) and a No. 11 seed (VCU) made the Final Four. In 2018, No. 1 seed Virginia lost to No. 16 seed UMBC and a No. 11 seed (Loyola Chicago) made the Final Four. In 2021, No. 2 seed Ohio State lost in the first round and No. 11 seed UCLA made the Final Four. In 2022, No. 15 seed Saint Peter's made the Regional Final and No. 8 seed North Carolina reached the Final game.

When it comes to unlikely events in the NCAA Tournament, No. 1 seed Virginia's loss in the first round to No. 16 seed UMBC is usually the event cited as being the most "Mad." However, the statistics (based on the Vegas spread) suggest that this type of upset should occur in about one percent of all such games. In other words, we should expect to see a No. 1 seed go down about once every 25 years.

However, a No. 15 seed advancing to the Regional Finals (as Saint Peter's did this year) had odds of roughly 0.18 percent or one-in-550. This suggests that this type of event should only happen once every 140 tournaments. The math suggests that no one alive today is likely to witness such an improbable Tournament run again.
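The "once every N tournaments" figures come from simple arithmetic. Each tournament has four No. 1 versus No. 16 games and four No. 15 seeds, so there are four chances per year for each type of event. The sketch below reproduces that math. The function name is hypothetical, and the probabilities are the rounded values quoted above.

```python
def tournaments_between_events(p_single, chances_per_tournament=4):
    """Average number of tournaments between occurrences of an event that has
    probability p_single on each of several chances per tournament."""
    expected_per_tournament = p_single * chances_per_tournament
    return 1.0 / expected_per_tournament

# A No. 16 seed beating a No. 1 seed: about 1 percent per game, four games a year.
print(round(tournaments_between_events(0.01)))    # about 25

# A No. 15 seed reaching the Regional Final: about 0.18 percent per team, four teams a year.
print(round(tournaments_between_events(0.0018)))  # about 139
```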

With that, it is time to finally put a bow on the college basketball season. Until next time, enjoy, and Go Green.
