2021 Bracket Analysis, Part One

The calendar says March, and for sports fans, that means that the Madness has arrived. I have never been that interested in gambling (despite my fascination with the predictive power of the Vegas spread) but every year at this time, I make sure to enter as many on-line March Madness and office pools as I can.

Over the years, I never had that much luck. I did well a few times, but most years my picks flamed out early. But recently, as my interest in sports analytics increased, I developed a certain strategy to make my picks. In 2019, this new methodology worked very well. 

It correctly predicted that Virginia would win the National Title. It predicted that No. 4 seed Auburn was a dark horse Final Four team, and it suggested that the winner of regional final games between No. 1 seed Duke and No. 2 seed Michigan State and No. 1 Gonzaga and No. 3 seed Texas Tech would likely join Virginia and Auburn in Minneapolis.

What I have learned from years of studying the tournament is that while the NCAA Tournament is chaotic, the chaos is predictable, on average.

The reason that tournament games are predictable is that the odds of an upset follow the same "rules" as any other college basketball game. That is, the odds of an upset can be predicted based on the point spread. Furthermore, since the structure of the tournament tends to pair teams together with a similar historical relative strength (i.e. spreads), the "chaos" tends to follow a pattern of sorts.

While Vegas spreads are only available for first round games, predictive tools such as Kenpom efficiencies allow us to project spreads for any arbitrary NCAA Tournament match-up. With these tools in hand, it is possible to both simulate the full tournament and to understand where the upsets are more (or less) likely to occur. 

Overall Upset Probabilities

Before we start to break down the 2021 bracket, it is important to understand the way that a typical NCAA bracket progresses. As an introduction, let's first take a bird's-eye view of the total number of upsets to expect in each round. Figure 1 below presents this data. Note that in all cases, an "upset" refers only to relative seeds of each team and not the Vegas line in any particular game.

Figure 1: Average number of seed upsets per round projected from a simulation of the 2021 Tournament, the averages from the last 18 Tournaments, and the actual number of upsets

There are three different sets of data here that all give similar information. In blue is the number of upsets per round predicted by my most recent simulation of the 2021 NCAA Tournament. In red is the average number of upsets per round from the set of simulations of all past tournaments back to 2002 (when Kenpom data is easily available). Finally, the green bar shows the average number of actual upsets in that same set of Tournaments.

This Figure already tells us a lot. First, I think that it clearly shows the power and accuracy of my Monte Carlo simulations of the Tournament. The historical simulation results agree very closely with the actual number of upsets observed. Second, it provides a clear guide for knowing roughly how many upsets to expect.

Specifically, the first round usually has between six and ten upsets per year. The second round typically has between three and seven. The Sweet 16 has one to three, and the regional final round usually has between zero and two. By the time we reach the Final Four, the higher seeds win most of the time. 

One subtle point in Figure 1 is that the 2021 Tournament may have slightly fewer upsets than expected in all rounds except the second round, which may be slightly above average.
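To make the idea of counting upsets by simulation concrete, here is a minimal sketch of a Monte Carlo run over one 16-team region. It is not the simulation behind Figure 1. The per-seed ratings are made-up numbers, loosely chosen so that the first-round spreads for the 1/16, 2/15, and 5/12 games land near the averages quoted in this post, and the 10-point standard deviation used to turn a spread into a win probability is an assumption.

```python
import math
import random

# Standard first-round pairing order for one 16-team region.
BRACKET_ORDER = [1, 16, 8, 9, 5, 12, 4, 13, 6, 11, 3, 14, 7, 10, 2, 15]

# Hypothetical power ratings in points, one per seed (NOT real Kenpom data).
HYPOTHETICAL_RATING = {
    1: 26.0, 2: 22.0, 3: 20.0, 4: 18.0, 5: 16.0, 6: 15.0, 7: 14.0, 8: 13.0,
    9: 12.5, 10: 12.0, 11: 11.5, 12: 11.0, 13: 9.0, 14: 7.0, 15: 5.5, 16: 2.0,
}

SPREAD_SIGMA = 10.0  # assumed std. dev. of the final margin around the spread


def win_probability(spread):
    """Chance that a team favored by `spread` points wins (normal model)."""
    return 0.5 * (1.0 + math.erf(spread / (SPREAD_SIGMA * math.sqrt(2.0))))


def simulate_region(rng):
    """Play one region once; return the number of seed upsets in each round."""
    alive = list(BRACKET_ORDER)
    upsets_per_round = []
    while len(alive) > 1:
        winners, upsets = [], 0
        # Adjacent entries in `alive` meet in the current round.
        for a, b in zip(alive[0::2], alive[1::2]):
            spread = HYPOTHETICAL_RATING[a] - HYPOTHETICAL_RATING[b]
            winner = a if rng.random() < win_probability(spread) else b
            if winner == max(a, b):  # the worse (higher-numbered) seed won
                upsets += 1
            winners.append(winner)
        upsets_per_round.append(upsets)
        alive = winners
    return upsets_per_round


def average_upsets(n_sims=2000, seed=2021):
    """Average seed upsets per round, summed over four identical regions."""
    rng = random.Random(seed)
    totals = [0, 0, 0, 0]
    for _ in range(n_sims):
        for _region in range(4):
            for i, count in enumerate(simulate_region(rng)):
                totals[i] += count
    return [t / n_sims for t in totals]


if __name__ == "__main__":
    rounds = ["First round", "Second round", "Sweet 16", "Regional final"]
    for name, avg in zip(rounds, average_upsets()):
        print(f"{name}: {avg:.2f} upsets on average")
```

The output depends entirely on the placeholder ratings, so it should be read as a demonstration of the mechanics rather than a reproduction of the figure. A real run would use a separate rating for every team in the actual bracket.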

Upset Rates Based on Seed Combinations

While knowing the total number of upsets is useful, in order to start making our picks, it is necessary to understand the odds for upset in any given match-up. Figure 2 is my ultimate guide to understanding these odds.

Figure 2: Actual upset frequency for selected seed combinations relative to the odds predicted based on average spreads.

I mentioned above that the upset frequency is completely predictable based on the historical odds derived from the Vegas lines. This Figure demonstrates this fact. For example, everyone's favorite upset pick is for the No. 12 seed to beat a No. 5 seed. This specific upset has occurred in all but five of the past 35 Tournaments. History shows that roughly one-third of all No. 5 seeds lose in the first round.

Based on the Vegas spread, this makes complete sense. The average point spread for a No. 5 / No. 12 match-up is right around five points, and teams favored by five points in college basketball win 69 percent of the time. The upset rate in the Tournament is exactly where it should be.
 
This logic extends perfectly even to the most rare and exciting of all upsets (at least when they happen to somebody else's team). The upset of a No. 2 seed by a No. 15 seed has only happened eight times in history out of 140 games back to 1985 (when the Tournament expanded to 64 teams). That is an upset rate of a little over five percent. The average point spread in a No. 2 / No. 15 game is around 16.5 points which corresponds to an upset rate of... five percent.

Even the rarest upset of all, No. 16 UMBC's epic upset of No. 1 seed Virginia in 2018, was somewhat predictable. With only one occurrence in 140 games, the odds of this type of upset must be around one percent. This also happens to be exactly the odds predicted in games where the spread is about 24 points, which is the case in games between No. 1 and No. 16 seeds, historically.
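The three examples above can be roughly reproduced with a simple model. The sketch below assumes that the final margin is normally distributed around the point spread with a standard deviation of about 10 points. That value is an assumption picked because it lines up with the percentages quoted above, not a number taken from the Vegas data.

```python
import math

SIGMA = 10.0  # assumed standard deviation of the final margin, in points


def upset_probability(spread, sigma=SIGMA):
    """Chance that a team favored by `spread` points loses outright."""
    favorite_wins = 0.5 * (1.0 + math.erf(spread / (sigma * math.sqrt(2.0))))
    return 1.0 - favorite_wins


# Average spreads quoted in the text for three seed pairings.
for label, spread in [("5 vs 12", 5.0), ("2 vs 15", 16.5), ("1 vs 16", 24.0)]:
    print(f"{label}: spread {spread:>4}, upset chance {upset_probability(spread):.1%}")
```

Under this assumption the three pairings come out at roughly 31 percent, 5 percent, and a bit under 1 percent, in line with the historical rates discussed above.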

Upset Rules of Thumb

Without any knowledge at all of the reason behind these upset rates, it is possible to develop a set of good rules of thumb in order to generate a bracket with a historically accurate number and distribution of upsets. Here are some rules that I like to use:
  • The No. 8 versus No. 9 games are basically toss-ups. I usually consult the Vegas line for these games and then go with my gut on each one.
  • The odds of a first round upset of a No. 7, No. 6 or No. 5 seed are all similar at between 35 and 40 percent. This means that between four and five upsets total in this group are expected in any given year.
  • For the teams seeded No. 4 and above, the upset rate drops to 20 percent or less. That said, one or two "big" upsets a year is normal, usually with the No. 3 or No. 4 seeds.
  • For second round games, No. 1 seeds get upset prior to the Sweet 16 almost exactly once every other year, on average. The rate has been a little higher than that recently. Exactly one No. 1 seed has been knocked out in the second round in seven of the last 10 tournaments. However, this came right after a stretch where all the No. 1 seeds advanced to the Sweet 16 in five straight tournaments.
  • Roughly one-third of all No. 2 seeds do not make it to the Sweet 16. Chances are, at least one will lose to a No. 7 or No. 10 seed in the second round. In 2019, all four No. 2 seeds advanced for the first time since 2009.
  • As for the No. 3 and No. 4 seeds, basically only half of them survive the first weekend, on average.
  • In the Sweet 16 round, the upset rate for No. 1 seeds is about one-in-six and it is about one-in-five for No. 2 seeds. Basically, only two-thirds of all No. 1 seeds make it to the regional final and a little under half of the No. 2 seeds usually make it.
  • As for the regional finals, the surviving No. 1 seeds get eliminated in a quarter of these games. Only 40 percent of all No. 1 seeds make it to a Final Four.
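The arithmetic behind a few of these rules can be checked in a couple of lines. The sketch below uses only the rates quoted in the list. Treating the four games in a group as independent is a simplifying assumption, and the helper names are just for illustration.

```python
def expected_upsets(games, upset_rate):
    """Expected number of upsets across `games` games with the same upset rate."""
    return games * upset_rate


def prob_at_least_one(games, upset_rate):
    """Chance of one or more upsets, assuming the games are independent."""
    return 1.0 - (1.0 - upset_rate) ** games


# No. 5, 6, and 7 seeds: three seed lines times four regions is 12 games.
low = expected_upsets(12, 0.35)
high = expected_upsets(12, 0.40)
print(f"Expected 5/6/7 seed upsets: {low:.1f} to {high:.1f}")

# No. 1 seeds: one upset before the Sweet 16 every other year, over four seeds.
per_seed_rate = 0.5 / 4
print(f"Per-team chance a No. 1 seed misses the Sweet 16: {per_seed_rate:.1%}")

# No. 2 seeds: about one-third miss the Sweet 16.
any_two_seed_out = prob_at_least_one(4, 1.0 / 3.0)
print(f"Chance at least one No. 2 seed misses the Sweet 16: {any_two_seed_out:.0%}")
```

This is where the "between four and five upsets" figure for the No. 5 through No. 7 seeds comes from, and why a year like 2019, in which every No. 2 seed reached the Sweet 16, is the exception.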

The Final Four and Champion

The rules of thumb above work well when applied to each individual region, but the crown jewel of every office pool bracket is the Final Four and eventual champion. Fortunately, there are several pieces of historical data that can guide this decision-making process as well.

For the teams that make up the Final Four, I find the figure below to be the most helpful.

Figure 3: Distribution of seeds in the Final Four from 1979 to 2019

In this case I have grouped the seeds based on the highest seed, second highest seed, third highest, and lowest appearing in the Final Four. For example, 93 percent of the time (in all but three years: 1980, 2006, and 2011) at least one No. 1 seed makes it to the Final Four.

However, the second highest seed in the Final Four is also a No. 1 seed (i.e. at least two No. 1 seeds survive to the last weekend) only slightly more than half of the time (54 percent). Having three or more No. 1 seeds in the Final Four has only happened six times since the era of Magic Johnson.

As for the third highest seed, this distribution peaks at the No. 2 seed, but the No. 3 and No. 4 seeds also have fairly high odds. As for the lowest seed to appear in any given Final Four, that is most often a No. 3 seed, but almost all seeds down to a No. 11 seed have a reasonable probability. Only three times in history has there been a Final Four without a team seeded No. 3 or lower.

In other words, a typical Final Four consists of at least one No. 1 seed, another No. 1 seed or a No. 2 seed, another No. 2 seed or a No. 3 seed, and then some other lower seed. 

In selecting the eventual National Champion, there is a very good rule of thumb based on Kenpom efficiency data. In 15 of the past 18 Tournaments, the eventual champion entered the Tournament ranked in the top six of Kenpom overall. In addition, 17 of the past 18 champions have ranked in the top 21 of offensive efficiency, and 16 have ranked in the top 31 of adjusted defensive efficiency. However, only three entered the tournament ranked No. 1 overall in Kenpom.

For reference, the current top six teams in Kenpom are Gonzaga, Baylor, Michigan, Iowa, Illinois, and Houston. All six teams are in the top 10 in offense, but Baylor and especially Iowa are outside of the top 30 in defense. This leaves Gonzaga, Michigan, Illinois, and Houston as the most likely national title contenders in 2021.
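The champion rule of thumb amounts to a simple three-part filter. The sketch below applies the thresholds from the text (top six overall, top 21 on offense, top 31 on defense) to a few placeholder teams. The team names and ranks are invented for illustration and are not the actual 2021 Kenpom rankings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TeamRanks:
    name: str
    overall: int   # overall efficiency rank
    offense: int   # adjusted offensive efficiency rank
    defense: int   # adjusted defensive efficiency rank


def fits_champion_profile(team, max_overall=6, max_offense=21, max_defense=31):
    """True if the team clears all three historical rank thresholds."""
    return (
        team.overall <= max_overall
        and team.offense <= max_offense
        and team.defense <= max_defense
    )


# Hypothetical ranks, for illustration only.
field = [
    TeamRanks("Team A", overall=1, offense=1, defense=10),
    TeamRanks("Team B", overall=2, offense=3, defense=44),   # defense too low
    TeamRanks("Team C", overall=9, offense=5, defense=12),   # outside top six
]

contenders = [team.name for team in field if fits_champion_profile(team)]
print(contenders)
```

With real rankings in place of the placeholders, the same filter narrows the top six down to the teams whose offense and defense both fit the historical profile.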

The analysis above is great for putting together a bracket that looks like it is feasible based on historical trends. I have given you the blueprint to pick the correct distribution of upsets. But the trick to winning the office pool is to pick the correct upsets, period. Knowing that one or two No. 12 seeds are likely to upset a No. 5 seed doesn't help us if we don't know which one to pick.

Fortunately, I have developed a method that might give you an edge. By carefully applying Kenpom efficiency data to any given bracket, it is possible to spot which upsets are more likely than others. It is possible to identify which regions are more likely to proceed according to seed, and which regions are more likely to blow up. It is possible to predict which No. 1 seed is likely to get upset first and which dark horse team is likely to reach the Final Four instead.

In part two of this analysis, I will walk you through this process for the 2021 bracket. Stay tuned.



