Ever since I was a kid, I have been fascinated by March Madness. As a teenager, I created hand-drawn brackets, many of which still reside in a laminated green folder on my bookshelf. As a young adult, I started recording the results on a simple spreadsheet. Over the years, that spreadsheet became more complex and powerful.
Today, I use modern analytics and computational tools to gain a better understanding of the tournament itself and perhaps even to get some hints as to how it might play out.
While there is no foolproof way to dominate your office pool, I have discovered a few tricks along the way that I find helpful. While we wait for the games to begin in earnest on Thursday, I will share some of what I have learned and how it applies to the 2023 bracket.
Methodology Overview
All of my analysis of college basketball odds is based on the same premise: Kenpom efficiency data can be used to assign a probability to any arbitrary basketball match-up. Knowing this, the full season and any tournament can be mathematically modeled and their odds can be calculated.
My favorite plot to highlight this fact is shown below.
Figure 1: Correlation between NCAA Tournament upsets and the odds predicted using Kenpom efficiency data.
This figure compares the winning percentage for the higher seeds in the NCAA Tournament to the odds expected based on the average point spread of games with that seed combination. The figure shows the data for all seed combinations that have occurred at least 40 times.
Figure 1 tells us why No. 15 seeds have won 10 times over the past 37 years (7% of the time). It is because, on average, No. 15 seeds are 15-point underdogs, and 15-point underdogs win straight up 7% of the time whether the game is played in March or in November.
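The link between a point spread and the odds of an outright win can be sketched in a few lines. This is a minimal illustration, not the exact model behind Figure 1: it assumes the final margin is normally distributed around the spread, and the standard deviation `SIGMA` is an assumed value chosen so that a 15-point underdog wins roughly 7% of the time, as quoted above.

```python
import math

# Assumed standard deviation (in points) of the final margin around the spread.
# About 10 points reproduces the ~7% figure for 15-point underdogs; this is an
# illustrative assumption, not a fitted value from the article.
SIGMA = 10.2


def upset_probability(spread: float, sigma: float = SIGMA) -> float:
    """Chance that a team listed as a `spread`-point underdog wins outright."""
    # Normal CDF evaluated at -spread / sigma.
    return 0.5 * (1.0 + math.erf(-spread / (sigma * math.sqrt(2.0))))


for spread in (1, 5, 10, 15, 20):
    print(f"{spread:>2}-point underdog wins {upset_probability(spread):.1%} of the time")
```

Under this assumption the month on the calendar never enters the calculation, only the spread does, which is the point the figure makes.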
There are a few notable deviations from this correlation. For example, No. 10 seeds have surprisingly good luck against No. 2 seeds and No. 5 seeds do not upset No. 1 seeds in the Sweet 16 as often as expected. But in general, the correlation is very strong.
As for the correlation between the Vegas point spreads and the point differentials predicted by Kenpom efficiency margins, Figure 2 below shows how strong this correlation is for the first-round games in the 2023 NCAA Tournament.
Figure 2: Correlation between the Vegas lines and the point differentials predicted using Kenpom efficiency margins. The right panel shows an enlarged view of the full data set.
Figure 2 gives me confidence Kenpom efficiencies can be used to model the results of the NCAA Tournament.
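For readers who want to see how an efficiency margin turns into a predicted point differential, here is a rough sketch. It relies on the fact that Kenpom's adjusted efficiency margin is expressed in points per 100 possessions; the possession count and the two team margins below are hypothetical placeholders, and the neutral-court simplification is an assumption.

```python
def predicted_margin(adj_em_a: float, adj_em_b: float, possessions: float = 68.0) -> float:
    """Expected scoring margin of team A over team B on a neutral floor.

    adj_em_a, adj_em_b: adjusted efficiency margins (points per 100 possessions).
    possessions: assumed number of possessions in the game (hypothetical default).
    """
    return (adj_em_a - adj_em_b) * possessions / 100.0


# Hypothetical teams and margins, for illustration only.
favorite_em, underdog_em = 24.0, 9.0
kenpom_line = predicted_margin(favorite_em, underdog_em)
print(f"Model line: favorite by {kenpom_line:.1f} points")
```

A line computed this way can then be set next to the Vegas spread for the same game, which is the comparison Figure 2 makes.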
2023 Bracket Overview
This year there has been a lot of discussion about parity. The Big Ten had one of the strangest seasons on record, which included 11 teams finishing within three games of each other. Parity seems to be present in every corner of college basketball and the overall field appears to be weaker in 2023 than in years past. How will this impact March Madness?
I attempted to explore this question by simulating the results of the 2023 tournament 5,000 times and counting the number of upsets that occurred in each round. I then compared these values to simulations of previous tournaments and to the results of those actual tournaments. The results are shown below in Figure 3.
Figure 3 demonstrates the accuracy of the simulation methodology relative to the actual number of observed upsets. The historical simulations match the observations within about half of an upset each round.
In addition, the results of the 2023 simulation suggest that we will likely see a few more upsets than usual in this year's NCAA Tournament, especially in the first weekend. A typical tournament averages 13 upsets total in the first two rounds. This year's simulation suggests that 14 or more should be expected.
Interestingly, the simulation also gives slightly higher than average odds for upsets in the Sweet 16 and Final Four rounds, but the Regional Final (Elite Eight) round might be a bit quieter.
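The upset-counting idea can be sketched as a small Monte Carlo simulation of a single 16-team region. Everything numeric here is an assumption for illustration: the efficiency margins are made up, `SIGMA` and `POSSESSIONS` are assumed constants, and an "upset" is taken to mean any win by the team with the larger seed number, which may differ from the definition used in Figure 3.

```python
import math
import random

SIGMA = 10.2        # assumed spread-to-probability scale, in points
POSSESSIONS = 68.0  # assumed possessions per game

# Standard bracket order of seeds within one region.
BRACKET_ORDER = [1, 16, 8, 9, 5, 12, 4, 13, 6, 11, 3, 14, 7, 10, 2, 15]


def win_prob(em_a: float, em_b: float) -> float:
    """Probability that team A beats team B, from efficiency margins."""
    spread = (em_a - em_b) * POSSESSIONS / 100.0
    return 0.5 * (1.0 + math.erf(spread / (SIGMA * math.sqrt(2.0))))


def simulate_region(margins: dict, rng: random.Random):
    """Play one region once; return (winning seed, upsets in each round)."""
    alive = list(BRACKET_ORDER)
    upsets = []
    while len(alive) > 1:
        winners, count = [], 0
        for a, b in zip(alive[0::2], alive[1::2]):
            winner = a if rng.random() < win_prob(margins[a], margins[b]) else b
            if winner == max(a, b):  # larger seed number means lower seed
                count += 1
            winners.append(winner)
        upsets.append(count)
        alive = winners
    return alive[0], upsets


# Hypothetical efficiency margins: each seed line is 2 points worse than the last.
margins = {seed: 28.0 - 2.0 * seed for seed in range(1, 17)}

rng = random.Random(2023)
n_sims = 5000
totals = [0, 0, 0, 0]
for _ in range(n_sims):
    _, upsets = simulate_region(margins, rng)
    totals = [t + u for t, u in zip(totals, upsets)]

average_upsets = [t / n_sims for t in totals]
print("Average upsets per round in one region:", average_upsets)
```

Running the same loop with each year's actual efficiency margins, across all four regions, would give the kind of year-to-year comparison described above.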
Speaking of the Final Four, I am also able to use the results of my Monte Carlo simulation to project the distribution of seeds that will advance to the final weekend. Most of the analysts that appear on television will frequently select three or even all four No. 1 seeds to make the Final Four. The reality is quite different.
Figure 4 shows the actual distribution of seeds in the Final Four from 2002 to 2022 (left panel). The right panel shows the distribution that I obtained in my simulation of the 2023 tournament.
Figure 4: Historical seed distribution in the Final Four (left panel) and the results of a simulation of the 2023 tournament (right panel)
History shows that a "typical" Final Four is actually made up of a No. 1 seed, a No. 1 or No. 2 seed, a No. 2 or No. 3 seed, and one lower seed that averages out to a No. 6 seed. More than two No. 1 seeds have made it to the Final Four only five times since seeding began in 1979.
Based on the data from this year's simulation, the Final Four is likely to be made up of more lower seeds than usual. The highest seed is still very likely to be a No. 1 seed, but there is over a 20% chance that none of the No. 1 seeds make it.
The second-best seed is likely a No. 1 or a No. 2 seed, but the odds that it is a No. 3, No. 4, or No. 5 seed are higher than usual. The third-highest seed in 2023 looks to be a toss-up between a No. 2, a No. 3, and a No. 4 seed. As for the lowest seed in 2023, the odds suggest that there is an 85% chance that it is a No. 4 seed or lower.
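A ranked Final Four seed distribution like this can be tabulated from simulation output with very little code. The sketch below assumes, purely for illustration, that all four regions share the same hypothetical odds for each seed to win the region; the numbers in `region_winner_odds` are placeholders, not the simulated 2023 values.

```python
import random
from collections import Counter

# Hypothetical odds for each seed to win its region (same for all four regions).
region_winner_odds = {
    1: 0.33, 2: 0.20, 3: 0.13, 4: 0.10, 5: 0.07, 6: 0.06,
    7: 0.04, 8: 0.03, 9: 0.02, 10: 0.01, 11: 0.01,
}


def final_four_distribution(odds: dict, n_sims: int = 5000, seed: int = 2023):
    """Return four dicts: seed frequencies for the best, 2nd, 3rd, and worst seed."""
    rng = random.Random(seed)
    seeds, weights = zip(*odds.items())
    slots = [Counter() for _ in range(4)]
    for _ in range(n_sims):
        # Draw one regional champion per region, then rank the four seeds.
        four = sorted(rng.choices(seeds, weights=weights, k=4))
        for slot, s in zip(slots, four):
            slot[s] += 1
    return [{s: c / n_sims for s, c in sorted(slot.items())} for slot in slots]


dist = final_four_distribution(region_winner_odds)
for label, slot in zip(("best", "second", "third", "lowest"), dist):
    print(label, slot)

# Exact chance that no No. 1 seed reaches the Final Four under these odds.
no_one_seed_exact = (1 - region_winner_odds[1]) ** 4
print(f"No No. 1 seed in the Final Four: {no_one_seed_exact:.1%}")
```

The closing calculation shows why a Final Four without a No. 1 seed is not far-fetched: if each top seed wins its region only about a third of the time, the chance that all four miss is the product of the four individual miss probabilities.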
Region-By-Region Overview
In part one of this series, I summarized the methodology that I use to break down and analyze the NCAA Tournament bracket. I use Kenpom efficiency data to estimate point spreads and I leverage those point spreads to simulate the tournament and to calculate the odds of various outcomes.
The initial look at the bracket suggests that there will be slightly more upsets than usual in 2023 and the Final Four will very likely include several teams not on the No. 1 seed line. But which upsets will actually take place? Once again, the #math can help. Let's now look at each region in turn and then make a few informed predictions.
South Region
Table 1 below summarizes all of the relevant data for the South Region.
Table 1: 2023 NCAA Tournament South Region odds
This table gives a lot of information that we will use to make our picks. The left side of the table shows the pre-tournament Kenpom adjusted efficiency margin for each team. The shaded cells on the left side of the table provide a comparison of each team's efficiency relative to the historical average of teams of that seed. This provides a simple way to look at the relative strength or weakness of each team and the bracket as a whole.
The middle of the table shows the odds for each team to advance through each round of the tournament. The teams are sorted not by seed, but by the odds for each team to advance to the Final Four. The red or green shaded cells on the far right are the relative odds for each team to advance compared to historical averages for that seed.
Finally, I added a new column for this year labeled "SoD" which stands for "strength of draw." This calculation starts with the odds for a historically average No. 1 seed to advance to the Final Four from any of the 16 positions on this year's bracket. I then compare those odds to the odds that the same historically average No. 1 seed would have to reach the Final Four in a historically average NCAA Tournament bracket.
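The strength-of-draw idea can be sketched with an exact bracket calculation. This is one possible reading, not necessarily the calculation behind the table: the efficiency margins are hypothetical, `REFERENCE_EM` stands in for a historically average No. 1 seed, and SoD is assumed to be the simple difference between the two Final Four odds.

```python
import math

SIGMA = 10.2         # assumed spread-to-probability scale, in points
POSSESSIONS = 68.0   # assumed possessions per game
REFERENCE_EM = 27.0  # hypothetical margin of a historically average No. 1 seed

BRACKET_ORDER = [1, 16, 8, 9, 5, 12, 4, 13, 6, 11, 3, 14, 7, 10, 2, 15]


def win_prob(em_a: float, em_b: float) -> float:
    """Probability that team A beats team B, from efficiency margins."""
    spread = (em_a - em_b) * POSSESSIONS / 100.0
    return 0.5 * (1.0 + math.erf(spread / (SIGMA * math.sqrt(2.0))))


def region_win_odds(margins: list) -> list:
    """Exact odds for each bracket slot to win the region (reach the Final Four)."""
    n = len(margins)
    reach = [1.0] * n  # probability each slot is still alive entering the round
    block = 1
    while block < n:
        nxt = [0.0] * n
        for i in range(n):
            # Possible opponents sit in the other half of this 2*block group.
            start = (i // (2 * block)) * 2 * block
            if i < start + block:
                foes = range(start + block, start + 2 * block)
            else:
                foes = range(start, start + block)
            nxt[i] = reach[i] * sum(
                reach[j] * win_prob(margins[i], margins[j]) for j in foes
            )
        reach = nxt
        block *= 2
    return reach


def strength_of_draw(this_year: list, historical: list, reference_em: float) -> list:
    """SoD per slot: reference team's odds in this bracket minus a typical one."""
    sod = []
    for slot in range(len(this_year)):
        actual, typical = list(this_year), list(historical)
        actual[slot] = typical[slot] = reference_em
        sod.append(region_win_odds(actual)[slot] - region_win_odds(typical)[slot])
    return sod


# Hypothetical data: a typical region, and a region that is 2 points weaker everywhere.
historical = [28.0 - 2.0 * seed for seed in BRACKET_ORDER]
this_year = [em - 2.0 for em in historical]

sod = strength_of_draw(this_year, historical, REFERENCE_EM)
for seed, value in zip(BRACKET_ORDER, sod):
    print(f"No. {seed:>2} slot: SoD {value:+.1%}")
```

Because the same reference team is dropped into every slot, the result isolates the path through the bracket from the quality of the team that actually occupies the slot. A positive value means an easier-than-usual draw.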
Overall, Table 1 shows us that while the top three seeds (Alabama, Arizona, and Baylor) have the best odds to win the region, all three teams are relatively weak compared to a normal No. 1, No. 2, or No. 3 seed. Virginia is also an exceptionally weak No. 4 seed.
As for possible South Region sleepers, No. 5 San Diego State and especially No. 6 Creighton, No. 9 West Virginia, and No. 10 Utah State are relatively strong. This is a great hint for where some potential first and second round upsets might occur.
The strength of draw metric (SoD) for all 16 teams is positive, which suggests that the 2023 South Region is relatively weak compared to past tournaments. Once again, San Diego State, Creighton, and Utah State grade out as having very good draws relative to typical teams on their seed lines.
But overall, Alabama still grades out with slightly above average odds to advance to the Final Four, which is the result that I feel is the most likely.
Midwest Region
Table 2 below summarizes all of the relevant data for the Midwest Region.
Table 2: 2023 NCAA Tournament Midwest Region odds.
Similar to the South Region, the Midwest Region also appears to be relatively weak. No. 1 Houston is barely an above average No. 1 seed, which is notable as the Cougars have the best overall odds to win the National Title at 17%.
However, seeds No. 2 through No. 6 are all well below average. The only relatively above average teams among the single-digit seeds are No. 7 Texas A&M and No. 9 Auburn. As a result, the region grades out as an easy draw for Houston, as well as for No. 12 Drake and No. 13 Kent State.
But Houston is still essentially a mid-major team that may now be playing without star guard Marcus Sasser. Can they be trusted to win four games in a row and advance to the Final Four? I have my doubts.
In all honesty, No. 16 Northern Kentucky is one of the few other above average seeds in the region. That No. 1 versus No. 16 match-up looks like a traditional No. 2 versus No. 15 match-up. I certainly would not recommend making this pick in your office bracket... but it is notable.
West Region
Table 3 below summarizes all of the relevant data for the West Region.
East Region
First Round Upset Analysis
Figure 1: Odds for the higher-seeded teams to win for each seed pairing, relative to the historical odds (shown in blue) for all first round games.
- No. 10 Utah State over No. 7 Missouri
- No. 10 Boise State over No. 7 Northwestern
- No. 9 Auburn over No. 8 Iowa
- No. 9 West Virginia over No. 8 Maryland
- No. 9 Florida Atlantic over No. 8 Memphis
- No. 12 Drake over No. 5 Miami
- No. 13 Kent State over No. 4 Indiana
Second Round Upsets
Figure 2: Odds for the higher-seeded teams to win for each seed pairing, relative to the historical odds (shown in blue) for all second round games.
- No. 5 San Diego State over No. 4 Virginia
- No. 6 Creighton over No. 3 Baylor
- No. 6 Kentucky over No. 3 Kansas State
- No. 8 Arkansas over No. 1 Kansas
- No. 6 Iowa State over No. 3 Xavier
- No. 10 Utah State over No. 2 Arizona
- No. 8 Florida Atlantic over No. 1 Purdue