Hoops Analysis: NCAA Tournament Easy Roads and Hard Ones

The 2021 NCAA Basketball Tournament and season have come to a close, but a new season always provides new data and new stories to tell about that data. In the 2021 Tournament, one storyline was the apparent ease with which No. 2 seed Houston was able to reach the Final Four. The Cougars' path to the final weekend went through a No. 15 seed (Cleveland State), a No. 10 seed (Rutgers), a No. 11 seed (Syracuse), and finally a No. 12 seed (Oregon State).

This marked the first time in history that a team had reached the Final Four without facing a single-digit seed. By some measures, this implies that Houston had the easiest path in history to the Final Four. But, for me, this type of discussion always raises the question of how to quantify something like the difficulty of a given NCAA Tournament path.

My approach to this question is to define a benchmark or reference team and then calculate the odds that this hypothetical team would reach the Final Four given any arbitrary tournament path. Fortunately, tempo-free metrics such as Kenpom efficiency margins provide just such an opportunity to quantify these odds.

Historical data suggest that an average Final Four team since 2002 has a Kenpom adjusted efficiency margin of around +25.4. (This means that an average Final Four team would be expected to beat an average Division I team by about 25 points in a game in which each team has 100 possessions.) This value is very similar to the pre-tournament efficiency margin of MSU's 2005 Final Four team. So, that team is effectively the reference team.

Using efficiency data, it is possible to project a point spread, and therefore a victory probability, for the reference team versus any arbitrary opponent, as long as the efficiency margin data are available. This is generally the case for all teams back to 2002 on Kenpom.com.
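The conversion from efficiency margins to a spread and then to a win probability can be sketched as follows. This is only a minimal illustration, not necessarily the exact method used for the numbers in this post. The possessions-per-game value, the standard deviation of the final margin, and the use of a normal distribution are all assumptions made for the example, as are the function names.

```python
from statistics import NormalDist

# Assumed constants (not taken from the article): a typical number of
# possessions per team per game, and the standard deviation of the actual
# final margin around the projected spread.
POSSESSIONS_PER_GAME = 68.0
MARGIN_STD_DEV = 11.0


def projected_spread(team_margin, opponent_margin,
                     possessions=POSSESSIONS_PER_GAME):
    """Projected point spread, positive when `team` is favored.

    Both margins are efficiency margins in points per 100 possessions,
    so the difference is scaled down to the assumed game length.
    """
    return (team_margin - opponent_margin) * possessions / 100.0


def win_probability(team_margin, opponent_margin,
                    possessions=POSSESSIONS_PER_GAME,
                    std_dev=MARGIN_STD_DEV):
    """Probability that `team` wins, assuming a normally distributed margin."""
    spread = projected_spread(team_margin, opponent_margin, possessions)
    return NormalDist(mu=0.0, sigma=std_dev).cdf(spread)


# Reference team (+25.4) against an average Division I team (0.0).
print(round(projected_spread(25.4, 0.0), 1))
print(round(win_probability(25.4, 0.0), 3))
```

Under these assumptions, two teams with equal margins each get a 50 percent chance, and the favorite's edge grows smoothly as the gap in efficiency margin widens.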

Calibrating the effect of bracket position

As a first step, I wanted to understand the general benefit that teams get from earning a higher seed. To achieve this, I set up a simulation of sorts involving a theoretical bracket made up entirely of teams with the historically average efficiency margin for teams of that seed.

For example, the average efficiency margin of all No. 1 seeds back to 2002 is +28.90. This corresponds to a team such as Michigan State's No. 1 seeded team in 2012. An average No. 2 seed historically has an efficiency margin of +23.5, which is similar to MSU's 2009 team, and so on. These teams make up the theoretical bracket.

I then calculated the odds that the reference team (MSU's 2005 team) would make the Final Four if it were inserted into this bracket of average teams as any seed, from No. 1 all the way to No. 16. I also assumed that in every round the reference team faces the highest-seeded available opponent (i.e. that there are no upsets). The result of this set of calculations is shown below in Figure 1.
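The "no upsets" assumption fixes the four opponents for each seed. A small sketch of how those opponents could be looked up is given here, assuming the standard 16-team regional bracket layout. The names REGION_ORDER and chalk_opponents are made up for this illustration.

```python
# Seeds listed in standard regional bracket order, so that neighbors meet in
# the first round (1 vs 16, 8 vs 9, and so on) and each block of 4, 8, and 16
# slots corresponds to the second round, Sweet 16, and regional final.
REGION_ORDER = [1, 16, 8, 9, 5, 12, 4, 13, 6, 11, 3, 14, 7, 10, 2, 15]


def chalk_opponents(seed):
    """Return the four opponents a seed would face if no upsets occur.

    In each round the team meets the best (lowest-numbered) seed from the
    other half of its current bracket block.
    """
    idx = REGION_ORDER.index(seed)
    opponents = []
    for rnd in range(1, 5):
        block = 2 ** rnd          # bracket slots that feed into this round
        half = block // 2
        start = (idx // block) * block
        if idx - start < half:
            other_half = REGION_ORDER[start + half:start + block]
        else:
            other_half = REGION_ORDER[start:start + half]
        opponents.append(min(other_half))
    return opponents


print(chalk_opponents(1))   # [16, 8, 4, 2]
print(chalk_opponents(11))  # [6, 3, 2, 1]
```

Multiplying the reference team's single-game win probabilities against the average team of each listed seed would then give the Final Four odds for that bracket position.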

Figure 1: Odds of a reference, average Final Four team making the Final Four in a bracket of historically average teams if the reference team were to be inserted as any seed and no upsets occur.

This figure shows us the true benefit of being a top seed. In this scenario, the No. 1 seed has a shade over a one-in-five chance to win the four games needed to make the Final Four. As a No. 2 seed, those odds fall by five percentage points to 15 percent. The odds continue to drop to 12 percent for a No. 3 seed, 11 percent for a No. 4 seed, and 10 percent for a No. 5 seed.

Interestingly, once a team drops to a No. 6 seed, the odds for the reference team to reach the Final Four are essentially equal (eight to nine percent) for all seeds from the six-line down to the 16-line. As a reminder, this calculation assumes that the efficiency of the reference team is fixed. So, whether it is a No. 6 seed or a No. 11 seed, the team is equally good.

This analysis already gives valuable insight. Basically, there is a clear advantage to being a No. 1 or a No. 2 seed. There is a slight advantage to being a No. 3, No. 4, or No. 5 seed, but after that the seed really doesn't matter with regard to the odds of making a Final Four.

The history of the NCAA Tournament is filled with examples of teams that peak right before the tournament, either due to injuries that heal or simply due to earlier inconsistency. These teams are likely better than the seed that they have been given and the average efficiency that the metrics assign to them. The good news for teams in this position is that whether they are given a No. 6 seed or a No. 11 seed, their odds to make the Final Four are roughly the same, all paths being equal.

Easy Paths and Hard Paths

The problem is, not all paths are equal. Using a similar method, it is possible to estimate the relative ease or difficulty of any of the paths that previous Final Four teams have traveled on their way to the final weekend. In this case, the reference team (as good as MSU in 2005) is used, but instead of calculating that team's Final Four odds against a theoretical average bracket, the efficiencies of the teams from real, historical NCAA Tournament paths are used.

For example, to compare the paths of both Baylor and Houston in the 2021 Tournament, I first looked up the pre-tournament efficiency margins for the four opponents of each of those teams en route to their meeting in the 2021 Final Four. As mentioned above, for Houston, these teams were Cleveland State, Rutgers, Syracuse, and Oregon State. For Baylor, these teams were Hartford, Wisconsin, Villanova, and Arkansas.

Then, I estimated the odds that the reference team (Michigan State in 2005) would win a game against each of the four teams in each set. The product of the four single-game odds gives the odds to make it to the Final Four on that path.
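As a rough sketch of that last step, the path odds are simply the product of the single-game win probabilities. The function name path_odds and the four probabilities below are placeholders chosen for illustration only; they are not the actual values for Houston, Baylor, or any other team.

```python
from math import prod


def path_odds(game_win_probabilities):
    """Odds of winning every game on a path, treating games as independent."""
    for p in game_win_probabilities:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"not a probability: {p}")
    return prod(game_win_probabilities)


# Placeholder single-game odds for a four-game path (illustrative only).
example_path = [0.90, 0.80, 0.70, 0.60]
print(round(path_odds(example_path), 4))  # 0.3024
```

Because the odds multiply, one very tough opponent drags down the whole path far more than a slightly tougher opponent in every round.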

I made the same calculation for each Final Four team's path to the Final Four back to 2002. I also pulled the numbers for MSU's Final Four teams in 1999, 2000, and 2001 for reference. For comparison, I also calculated the Final Four odds for each path assuming that each opponent was an average team for that seed and not the actual opponent.

For example, in the case of Houston, I calculated the odds for the reference team to beat an average No. 15, No. 10, No. 11, and No. 12 seed instead of Houston's actual opponents. The results of this calculation are shown below in Figure 2.

Figure 2: Comparison of the difficulty of different paths to the Final Four, based on the odds that a reference team would reach the Final Four using the path of each team

All 76 teams to play in a Final Four since 2002 (plus the three additional MSU Final Four teams) are shown in this figure, and there is a lot to observe.

The x-axis shows the actual odds, or true normalized difficulty, of each team's path. Based on this analysis, the 2021 Houston team did, in fact, have the easiest Final Four path of any team in history. The reference team had a 39 percent chance to win those four games.

The most difficult tournament path in recent history belongs to the 2019 Texas Tech squad. In this case, the reference team only had a seven percent chance to reach the final weekend. Houston's path was five-and-a-half times easier than Texas Tech's path two years prior.

Note that the dotted orange line represents the median of the data set. So, the teams to the right of this line had a path that was easier than average, while the teams on the left side of the graph had a harder than average path.

As for MSU, Coach Izzo has clearly experienced some of the easiest, as well as some of the most difficult, tournament paths in history. Half of Izzo's Final Four teams fall to the right of the orange line, while the other half are on the left.

MSU's most difficult path in the Izzo era was in 2015 when the Spartans faced No. 10 Georgia, No. 2 Virginia, No. 3 Oklahoma, and No. 4 Louisville. The Spartans' softest Final Four path was in 2001 when they faced No. 16 Alabama State, No. 9 Fresno State, No. 12 Gonzaga, and No. 11 Temple. Coach Izzo's other six paths are closer to the median.

The y-axis on Figure 2, which gives the odds for the reference team if they were to face an average version of each seed, gives us some additional insight. If a data point falls above the diagonal line, this implies that the path that team took in reality is actually harder than it appears based simply on the seeds. The opposite is also true. Data points that fall below the diagonal line represent teams whose Final Four path was easier than expected, based on the seeds of the opponents.
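The reading of the diagonal can be written out as a simple comparison. This sketch assumes the diagonal is the line where the seed-based odds equal the actual odds. The function name is made up for the example, and the two Houston values are the 39 percent and 41 percent figures quoted elsewhere in this post.

```python
def compare_to_seed_expectation(actual_odds, seed_based_odds):
    """Classify a path relative to the diagonal (assumed to be y = x).

    actual_odds:     x-axis value, odds against the real opponents
    seed_based_odds: y-axis value, odds against average teams of those seeds
    """
    if seed_based_odds > actual_odds:
        # Point sits above the diagonal: real opponents were tougher.
        return "harder than the seeds suggest"
    if seed_based_odds < actual_odds:
        # Point sits below the diagonal: real opponents were weaker.
        return "easier than the seeds suggest"
    return "as expected from the seeds"


# Houston in 2021: 39 percent actual odds, 41 percent seed-based odds.
print(compare_to_seed_expectation(0.39, 0.41))
```

By this reading, even Houston's record-setting path sits slightly above the diagonal, since its actual opponents graded out a little tougher than average teams of the same seeds.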

These differences can be more easily understood by looking at a selection of the Final Four paths in more detail. Tables 2 and 3 below give the opponent details for the teams that took the 20 easiest and 20 hardest paths to the Final Four.

Table 2: Detailed opponent data for the 20 easiest paths to the Final Four

Table 3: Detailed opponent data for the 20 hardest paths to the Final Four

In the case of Michigan State in 2001, based on just the seeds that the Spartans faced (No. 16, No. 9, No. 12, and No. 11), this path should be the easiest path in history. If those four teams were merely average teams of that seed, the reference team's odds to make the Final Four would be about 43 percent, which is slightly easier than the 41 percent odds that the reference team would have using a set of average seeds equivalent to Houston's path in 2021.

But, if the efficiencies of the actual opponents are considered, MSU's path in 2001 drops to seventh place, as shown in Table 2. In this case, for each opponent, the Kenpom efficiency margin relative to the average margin for that seed is shown in the table.

For MSU in 2001, once the Spartans got past the first round, the next three opponents were above average for their seed. Specifically, No. 12 Gonzaga's Kenpom efficiency margin was 2.3 points better than that of an average No. 12 seed. The Bulldogs were more similar to an average No. 10 seed.

Furthermore, No. 11 Temple's efficiency margin was +6.8 points better than average. That would make the 2001 Temple team more similar to a No. 4 seed. (Unfortunately for the Spartans, their national semifinal opponent in 2001, No. 2 Arizona, was also significantly above average for their seed... and it showed.)

MSU's 1999 Final Four team also had a path that was harder in reality than it might look on paper. Sweet 16 opponent No. 13 Oklahoma and regional final opponent No. 3 Kentucky both had Kenpom efficiency margins significantly above average for their seed.

I should note here that in one of my previous analyses, I argued that the data suggest MSU is, on average, the most under-seeded team in the recent history of the Tournament. But, part of my reasoning was that it seemed unlikely that MSU's Tournament opponents were, on average, under-seeded. However, the data in Figure 2 suggest that for the 1999 and 2001 Final Four teams, that was certainly the case, as they are two clear outliers. It seems that the committee might have under-seeded both MSU and the teams in MSU's path.

On the other side of the coin, there are several notable teams whose NCAA Tournament paths were actually quite a bit easier than they appear just based on seeding. For example, the 2005 Illinois team, the 2004 UCONN team, and especially the 2006 UCLA team all had paths that grade out as weak, not just due to the seeds that they faced, but also due to opponents that Kenpom data suggest were overrated.

UCLA's 2006 path to the Final Four was particularly soft. All four of their opponents were notably below average for their seed, based on Kenpom. UCLA's Sweet 16 and regional final opponents (No. 3 Gonzaga, -7.6 and No. 1 Memphis, -6.6) were particularly weak. The data suggest that those two teams graded out close to a No. 11 seed and a No. 3 seed, respectively.

Finally, using the same method, it is also straightforward to quantify the level of difficulty for each team to both reach the Championship game and to win the National Title. Briefly, the top five easiest paths to the Championship game (with the normalized odds for the reference team) are:

North Carolina in 2016 (22.8 percent)
UCLA in 2006 (21.7 percent)
Texas in 2003 (21.7 percent)
Michigan in 2018 (20.2 percent)
Illinois in 2005 (21.1 percent)

Here are the top five overall easiest NCAA Tournament paths (including the title game, for those that made it):

UCLA in 2006 (11.7 percent)
North Carolina in 2016 (10.6 percent)
Florida in 2006 (8.9 percent)
Villanova in 2018 (8.0 percent)
Louisville in 2013 (7.8 percent)

That's all for today. Until next time, enjoy, and Go Green.

Dr. Green and White Sports Authority
