Back in April, the Men's NCAA Tournament came to its conclusion with the No. 1 seed Kansas Jayhawks defeating the No. 8 seed North Carolina Tar Heels to claim the 2022 National Title. Like every NCAA champion since the mid-1980s, the Jayhawks had to beat six teams over three weekends to earn the right to cut down the nets.
Clearly, not all paths in the NCAA Tournament are equally difficult. The entire structure of the tournament is based on the idea that the better teams (i.e., higher seeds) are given the easiest draw. If the tournament were to play out as planned with the higher seeds always winning, the higher seeds would wind up with the easiest path to the Final Four. But of course, this never actually happens. Every so often a No. 15 seed wins a game (or three), busts the bracket, and clears the path for a No. 8 seed to survive to the final weekend.
That said, understanding the true difficulty or ease of a specific NCAA Tournament draw or path is more complex than simply looking at seeds. For while all paths are not equal, neither are all seeds. Fortunately, we have certain metrics available to understand these differences more quantitatively. Team efficiency data, such as the values produced by Ken Pomeroy, can predict point spreads and victory probabilities well enough to allow for accurate modeling of NCAA Tournaments past, present, and future.
Just to give an example, based on Kenpom data, the best No. 1 seed since at least 2002 is the 2015 Kentucky Wildcats (with an adjusted efficiency margin of +37.43). The weakest No. 1 seed in the same timeframe is the 2018 Xavier Musketeers (+21.69). If those two teams were somehow able to play each other, the math suggests that Kentucky would be favored by about 10.5 points. This spread is actually closer to the average line expected for a second-round game between a No. 1 seed and a No. 8 seed.
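As a rough illustration of the arithmetic behind that spread, here is a minimal sketch. It assumes that an efficiency margin is expressed in points per 100 possessions, that a typical game has about 67 possessions, and that final margins scatter around the spread with a standard deviation of about 11 points. The tempo and standard deviation are my assumptions for illustration, not values taken from Kenpom, and the function names are hypothetical.

```python
from statistics import NormalDist

# Assumptions for illustration only (not published Kenpom values):
TEMPO = 67.0   # possessions per game
SIGMA = 11.0   # standard deviation of the final margin, in points


def point_spread(em_a: float, em_b: float, tempo: float = TEMPO) -> float:
    """Expected neutral-court margin of team A over team B, in points."""
    return (em_a - em_b) * tempo / 100.0


def win_probability(em_a: float, em_b: float,
                    tempo: float = TEMPO, sigma: float = SIGMA) -> float:
    """Chance that team A beats team B under a normal model of the margin."""
    return NormalDist(0.0, sigma).cdf(point_spread(em_a, em_b, tempo))


# 2015 Kentucky (+37.43) against 2018 Xavier (+21.69), margins from the text.
spread = point_spread(37.43, 21.69)
prob = win_probability(37.43, 21.69)
print(f"spread: {spread:.1f} points, win probability: {prob:.2f}")
```

Under these assumptions the spread comes out to roughly 10.5 points, in line with the figure quoted above. A different tempo or standard deviation would shift the numbers somewhat.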
Using a similar analysis, I have devised a way to quantify the difficulty of any given NCAA Tournament draw and path. The key idea is to replace any given team in any past NCAA Tournament bracket with a historically average Final Four team. Tom Izzo's 2005 Michigan State team is a great benchmark, with a pretournament adjusted efficiency margin of +25.62, which was good enough for No. 6 overall in Kenpom going into March Madness (despite that team earning just a No. 5 seed that year).
By artificially placing this benchmark team in literally every position in every NCAA Tournament bracket back to 2002, it is possible to calculate that team's odds to advance through every round of the tournament. Two different calculations can be made. First, it is possible to calculate overall odds based on the full, original bracket. I will refer to this calculation as the "draw." This calculation takes into account the odds of various upsets through the bracket, whether they occurred or not. It is equivalent to the odds that can be calculated at the beginning of each tournament.
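The "draw" calculation can be sketched as follows. This is an illustration under stated assumptions rather than the exact model used here: it uses a normal win-probability model with an assumed tempo of 67 possessions and a standard deviation of 11 points, and the region's efficiency margins are placeholders (falling linearly from +28 for the No. 1 seed to -2 for the No. 16 seed), not real data. Only the benchmark margin of 25.62 comes from the text.

```python
from statistics import NormalDist

TEMPO, SIGMA = 67.0, 11.0   # assumed values, for illustration only
BENCHMARK_EM = 25.62        # 2005 Michigan State, from the text
# Standard bracket order for one 16-team region, top to bottom.
SEED_ORDER = [1, 16, 8, 9, 5, 12, 4, 13, 6, 11, 3, 14, 7, 10, 2, 15]


def win_prob(em_a, em_b):
    """Chance that a team with margin em_a beats a team with margin em_b."""
    return NormalDist(0.0, SIGMA).cdf((em_a - em_b) * TEMPO / 100.0)


def advance_odds(margins):
    """Odds for every team to survive each round.

    margins: efficiency margins listed in bracket order.
    Returns one list per round; entry i is the chance that team i
    wins that round and every round before it.
    """
    n = len(margins)
    alive = [1.0] * n              # chance each team reaches the current round
    rounds, size = [], 1
    while size < n:
        nxt = [0.0] * n
        for i in range(n):
            sibling = (i // size) ^ 1          # block holding possible opponents
            opponents = range(sibling * size, (sibling + 1) * size)
            nxt[i] = alive[i] * sum(
                alive[j] * win_prob(margins[i], margins[j]) for j in opponents
            )
        rounds.append(nxt)
        alive, size = nxt, size * 2
    return rounds


def draw_odds(margin_by_seed, seed):
    """Odds per round for the benchmark team dropped into `seed`'s slot."""
    margins = [BENCHMARK_EM if s == seed else margin_by_seed[s]
               for s in SEED_ORDER]
    idx = SEED_ORDER.index(seed)
    return [rnd[idx] for rnd in advance_odds(margins)]


# Placeholder region, NOT real data.
placeholder = {s: 28.0 - 2.0 * (s - 1) for s in range(1, 17)}
for seed in (1, 5, 11):
    print(seed, [round(p, 3) for p in draw_odds(placeholder, seed)])
```

The last number in each printed list is the benchmark team's chance of winning the region from that seed line. Because every possible opponent is weighted by its own chance of getting that far, upsets are accounted for whether or not they actually happened.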
The second calculation is based on the actual path that each team took through the tournament. For example, it considers the odds for a team like 2022 North Carolina to advance knowing that they faced No. 15 seed St. Peter's in the Regional Final, as opposed to a team like No. 2 Kentucky or No. 3 Purdue. However, the odds of North Carolina's "path" are calculated assuming that the Tar Heels had the same efficiency as the 2005 Michigan State reference team.
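The "path" calculation is simpler, since the opponents are known. A minimal sketch, again assuming a normal win-probability model with a tempo of 67 possessions and a standard deviation of 11 points: the benchmark team's odds are just the running product of its single-game win probabilities. The opponent margins below are hypothetical placeholders, not the actual Kenpom values for any team.

```python
from statistics import NormalDist

TEMPO, SIGMA = 67.0, 11.0   # assumed values, for illustration only
BENCHMARK_EM = 25.62        # 2005 Michigan State, from the text


def win_prob(em_a, em_b):
    """Chance that a team with margin em_a beats a team with margin em_b."""
    return NormalDist(0.0, SIGMA).cdf((em_a - em_b) * TEMPO / 100.0)


def path_odds(opponent_margins, team_em=BENCHMARK_EM):
    """Cumulative odds of surviving each game on a known path."""
    odds, running = [], 1.0
    for opp_em in opponent_margins:
        running *= win_prob(team_em, opp_em)
        odds.append(running)
    return odds


# Hypothetical margins for four regional opponents, NOT real data.
# The two paths differ only in the regional final.
tough_path = [12.0, 27.0, 22.0, 24.0]   # strong opponent in the last game
soft_path = [12.0, 27.0, 22.0, 8.0]     # Cinderella opponent in the last game

print([round(p, 3) for p in path_odds(tough_path)])
print([round(p, 3) for p in path_odds(soft_path)])
```

Swapping a strong regional final opponent for a much weaker one leaves the first three values unchanged and raises only the last, which is the effect the path calculation is meant to capture.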
Comparison to Historical Averages
Before jumping directly into the numbers for a given tournament, it is helpful to understand the context for some of the numbers to come. To this end, I set up a set of calculations based on a reference tournament composed of an imaginary set of perfectly average teams, each assigned the average Kenpom efficiency for its seed. It is possible to calculate the odds for each seed to advance through the tournament. Furthermore, it is also possible to calculate the difficulty of the draw and path based on the methods described above. Figure 1 below gives the results of these calculations.
Figure 1: Difficulty of NCAA Tournament paths and specific draws for a simulated tournament where all teams are historically average for their seed
Three different data sets are shown. Based on the definitions above, the green bars in the left panel represent the average draw for each seed. In other words, these are the odds that the reference Final Four team (2005 Michigan State) would have to advance to the Final Four if placed in any slot in the bracket, from the No. 1 seed to the No. 16 seed.
Besides just being a useful reference, the green bars provide a sense of how much of an advantage it is to be a higher seed. A No. 1 seed has a 27 percent chance to win the region. Those odds drop to 22 percent for a No. 2 seed placement, 19 percent for a No. 3 seed placement, and 16 percent for a No. 4 seed placement.
For No. 5 seeds and lower, the odds start to level off between 11 and 12 percent. As one might expect, it is slightly better to be located in the "bottom" of the bracket, away from the No. 1 seed. In fact, of the lower seeds, the sweet spot is the No. 11 seed. This makes sense, as the No. 11 seed avoids playing the No. 2 and No. 1 seeds the longest, which means there is a slightly higher chance that an upstream upset will make the actual path easier. This may help to explain why No. 11 seeds have more Final Four appearances (five) than the remaining seeds from No. 9 to No. 16 combined (three) since 1979.
The other two data sets in Figure 1 represent the two extremes of specific paths that the reference team could take through the bracket. The left panel shows the "chalk" path where the reference team would face the highest possible seed in each round. For example, it would assume that the No. 1 seed would play the No. 16 seed, the No. 8 seed, the No. 4 seed, and finally the No. 2 seed on the path to the Final Four. On average, the real path tends to be between four and six percentage points easier than the most difficult possible path for each seed.
Finally, the right panel of Figure 1 shows the odds for the "anti-chalk" path for each seed. As the name implies, this is the easiest possible path that each seed could take, assuming the maximum number of upsets. For the No. 1 seed, this would mean facing the No. 16 seed, the No. 9 seed, the No. 13 seed, and finally the No. 15 seed. While this path is extremely unlikely, it would give each seed a significant advantage. Basically, any seed better than a No. 10 seed would have Final Four odds that shoot up to 55 to 60 percent. Once again, this all assumes that the team in question is as good as an average Final Four team, regardless of their assigned seed in this calculation.
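The chalk and anti-chalk opponents follow directly from the layout of a 16-team region. As a small sketch (the function names are my own, for illustration), each round's possible opponents are the seeds in the neighboring block of the bracket. The chalk opponent is the best seed in that block and the anti-chalk opponent is the worst.

```python
# Standard bracket order for one 16-team region, top to bottom.
SEED_ORDER = [1, 16, 8, 9, 5, 12, 4, 13, 6, 11, 3, 14, 7, 10, 2, 15]


def possible_opponents(seed):
    """Seeds a team could meet in each of the four regional rounds."""
    idx = SEED_ORDER.index(seed)
    rounds, size = [], 1
    while size < len(SEED_ORDER):
        sibling = (idx // size) ^ 1   # neighboring block of the same size
        rounds.append(SEED_ORDER[sibling * size:(sibling + 1) * size])
        size *= 2
    return rounds


def chalk_path(seed):
    """Hardest path by seed: the best possible seed in every round."""
    return [min(block) for block in possible_opponents(seed)]


def anti_chalk_path(seed):
    """Easiest path by seed: the worst possible seed in every round."""
    return [max(block) for block in possible_opponents(seed)]


print(chalk_path(1))        # [16, 8, 4, 2]
print(anti_chalk_path(1))   # [16, 9, 13, 15]
print(chalk_path(11))       # [6, 3, 2, 1]
```

The first two lines reproduce the No. 1 seed paths described above. The third shows that a No. 11 seed on a chalk path would not see the No. 2 seed until the regional semifinal or the No. 1 seed until the regional final.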
Strength of Draw
With that background established, it is now time to look at the results of the draw and path difficulty calculations for all NCAA Tournament teams stretching back to 2002. The full data set with reference to Final Four draw difficulty is shown in Figure 2.
Figure 2: Difficulty of all NCAA Tournament draws from 2002-2022. A higher number is an easier draw, as it implies a higher probability of advancing to a Final Four
In general, the shape of the curve in Figure 2 is similar to the green bars above in Figure 1. However, there appears to be more deviation at the extreme ends of the figure. In other words, there are a handful of NCAA Tournament draws over the years which have been particularly lucky or unlucky. Table 1 below shows the data for some of these teams at the extremes.
Table 1: Extreme examples of easy (top) and hard (bottom) NCAA Tournament draws since 2002
Table 1 is sorted based on the difficulty of the draw to advance to the Final Four, but the table contains data for all rounds starting at the Sweet 16.
The top of Table 1 shows a list of some of the easiest NCAA Tournament draws over the past 20 years. The top two spots are held down by the 2002 Maryland Terrapins, who won the National Championship, and the 2006 UCLA Bruins, who lost in the title game to No. 3 seed Florida.
A closer look at the makeup of each region gives clues as to why these two draws were so relatively easy. Figure 3 below compares the pretournament Kenpom efficiencies of the teams in the 2002 East Region and the 2006 West Region to the historical average values for those seeds.
Figure 3: Comparison of the Kenpom efficiency margins for the teams in the 2002 East Region and the 2006 West Region relative to historical averages (shown by the blue circles)
In both cases, No. 1 seed Maryland and No. 2 seed UCLA were grouped with other highly seeded teams that were well below the historical average for their seeds. In 2002, No. 3 seed Georgia and especially No. 2 seed UConn were both very weak. To make the situation even a bit easier, No. 8 Wisconsin and No. 9 St. John's were also surprisingly weak. In 2006, No. 2 UCLA was placed in a region with essentially a pair of mid-majors (No. 1 Memphis and No. 3 Gonzaga) as well as a weak pair of No. 7 and No. 10 seeds as potential second-round opponents.
A similar pattern arises for all of the teams listed in Table 1. A typical "easy" draw usually occurs when two or more of the top seeds in the region are remarkably weak, historically speaking. It also helps when the first round and potential second round opponents are below average.
The opposite is true for the teams at the bottom of the table, which had surprisingly tough draws. In these cases, the region usually has two or three teams on the top few seed lines that are significantly above average. Teams that start the tournament in the First Four also tend to drift to the bottom of this table because they have to play an extra game.
Overall, Table 1 also gives some hints as to years when the tournament field appears to have been relatively weak or relatively strong. For example, the 2003 and 2006 tournaments seem to have been particularly weak, as several teams from these years appear at the top of the table. Conversely, the 2015, 2019, and 2021 tournaments all seem to have been relatively strong.
Finally, I should note that Michigan State does have one team that appears in Table 1. The snake-bitten 2016 team ironically had the 17th easiest draw since 2002. That region contained a fairly average No. 1 seed in Virginia and a well below average No. 3 seed in Utah, and No. 10 seed Syracuse wound up winning it. Unfortunately, that region also contained a surprisingly good No. 15 seed called Middle Tennessee State.
For completeness, Table 3 below gives the remaining strength of draw data for all of Michigan State's tournament appearances since 2002. In this case, the Final Four difficulty is referenced to the average for that seed to give a relative sense of the difficulty of the draw compared to other teams of the same seed.
Table 3: Summary of Michigan State's NCAA Tournament draw difficulty since 2002.
Strength of Path
Table 4: NCAA Tournament path difficulty for the last 19 National Champions.
Table 5: NCAA Tournament path difficulty for selected Final Four teams since 2002.