Dr. Green and White helps you fill out your bracket (2023)

Ever since I was a kid, I have been fascinated by March Madness. As a teenager, I created hand-drawn brackets, many of which still reside in a laminated green folder on my bookshelf. As a young adult, I started recording the results on a simple spreadsheet. Over the years, that spreadsheet became more complex and powerful.

Today, I have found that I can use modern analytics and computational tools to gain a better understanding of the tournament itself and perhaps even get some hints as to how it might play out.

While there is no foolproof way to dominate your office pool, I have discovered a few tricks along the way that I find helpful. While we wait for the games to begin in earnest on Thursday, I will share some of what I have learned and how it applies to the 2023 bracket.

Methodology Overview

The foundation of my methodology is an observation that I made several years ago that boils down to this:

When it comes to NCAA Tournament upsets, the behavior is exactly the same as in regular season games. The odds are largely predictable based on Vegas point spreads and by tools that can predict point spreads, such as Kenpom efficiency margin data.

All of my analysis of college basketball odds is based on this same premise. Kenpom efficiency data can be used to assign probabilities to any arbitrary basketball match-up. Knowing this, the full season and any tournament can be mathematically modeled and its odds can be calculated.

My favorite plot to highlight this fact is shown below.

Figure 1: Correlation between NCAA Tournament upsets and the odds predicted using Kenpom efficiency data.

This figure compares the winning percentage for the higher seeds in the NCAA Tournament to the odds expected based on the average point spread of games with that seed combination. The figure shows the data for all seed combinations that have occurred at least 40 times.

Figure 1 tells us why No. 15 seeds have won 10 times over the past 37 years (7% of the time). It is because on average No. 15 seeds are 15-point underdogs, and 15-point underdogs win straight up 7% of the time whether the game is played in March or in November.
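To make the link between a point spread and the odds of an upset concrete, here is a minimal sketch. It assumes that the final margin of a game is roughly normally distributed around the point spread, which is a common modeling choice and not necessarily the exact method behind Figure 1. The width of that distribution (SIGMA) is backed out from the one number quoted above: a 15-point underdog wins about 7% of the time. The function name is made up for illustration.

```python
from statistics import NormalDist

# Assumption: the final margin is roughly normal around the point spread.
# The width of that distribution is calibrated to the one figure quoted
# in the text: a 15-point underdog wins about 7% of the time.
UNDERDOG_SPREAD = 15.0
UNDERDOG_WIN_RATE = 0.07
SIGMA = UNDERDOG_SPREAD / -NormalDist().inv_cdf(UNDERDOG_WIN_RATE)


def favorite_win_probability(spread):
    """Probability that a team favored by `spread` points wins outright."""
    return NormalDist(mu=0.0, sigma=SIGMA).cdf(spread)


if __name__ == "__main__":
    print(f"implied standard deviation: {SIGMA:.1f} points")
    for spread in (0, 3, 7, 15):
        print(f"favored by {spread:>2}: {favorite_win_probability(spread):.1%}")
```

Under this assumption a pick'em game comes out at exactly 50% and a 15-point favorite at 93%, by construction. Everything in between depends on how well the normal approximation holds.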

There are a few notable deviations from this correlation. For example, No. 10 seeds have surprisingly good luck against No. 2 seeds and No. 5 seeds do not upset No. 1 seeds in the Sweet 16 as often as expected. But in general, the correlation is very strong.

As for the correlation between the Vegas point spreads and the point differentials predicted by Kenpom efficiency margins, Figure 2 below shows how strong this correlation is for the first-round games in the 2023 NCAA Tournament.

Figure 2: Correlation between the Vegas lines and the point differentials predicted using Kenpom efficiency margins. The right panel shows an enlarged view of the full data set.

Figure 2 gives me confidence that Kenpom efficiencies can be used to model the results of the NCAA Tournament.

2023 Bracket Overview

This year there has been a lot of discussion about parity. The Big Ten had one of the strangest seasons on record which included 11 teams finishing within three games of each other. Parity seems to be present in every corner of college basketball and the overall field appears to be weaker in 2023 than in years past. How will this impact March Madness?

I attempted to explore this question by simulating the results of the 2023 tournament 5,000 times and counting the number of upsets that occurred in each round. I then compared these values to simulations of previous tournaments and to the results of those actual tournaments. The results are shown below in Figure 3.

Figure 3: Number of projected upsets per round of the 2023 NCAA Tournament based on a Monte Carlo Simulation and compared to the historical value and the average of the series of historical simulations.

Figure 3 demonstrates the accuracy of the simulation methodology relative to the actual number of observed upsets. The historical simulations match the observations within about half of an upset each round.

In addition, the results of the 2023 simulation suggest that we will likely see a few more upsets than usual in this year's NCAA Tournament, especially in the first weekend. A typical tournament averages 13 upsets total in the first two rounds. This year's simulation suggests that 14 or more should be expected.

Interestingly, the simulation also gives slightly higher than average odds for upsets in the Sweet 16 and Final Four rounds, but the Regional Final (Elite Eight) round might be a bit quieter.
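The upset-counting simulation described above can be sketched for a single 16-team region as follows. This is only an illustration under stated assumptions, not the actual model behind Figure 3. The efficiency margins are placeholder values generated from the seed number (not real Kenpom data), the tempo factor and standard deviation are assumed constants, and an "upset" is assumed to mean any win by the team with the larger seed number.

```python
import random
from statistics import NormalDist

# Standard first-round ordering of seeds within one 16-team region.
BRACKET_ORDER = [1, 16, 8, 9, 5, 12, 4, 13, 6, 11, 3, 14, 7, 10, 2, 15]

# Hypothetical placeholders, NOT real Kenpom data: efficiency margin by seed.
PLACEHOLDER_MARGIN = {seed: 30.0 - 2.0 * seed for seed in range(1, 17)}

TEMPO_FACTOR = 0.68  # assumed: about 68 possessions per game, margins are per 100
SIGMA = 10.2         # assumed std. dev. of the final margin around the spread


def win_probability(margin_a, margin_b):
    """Chance that team A beats team B, given their efficiency margins."""
    spread = (margin_a - margin_b) * TEMPO_FACTOR
    return NormalDist(0.0, SIGMA).cdf(spread)


def simulate_region(margins, rng):
    """Play one region once; return the number of upsets in each of 4 rounds."""
    alive = list(BRACKET_ORDER)
    upsets_per_round = []
    while len(alive) > 1:
        winners, upsets = [], 0
        # Adjacent entries in `alive` are paired against each other.
        for a, b in zip(alive[0::2], alive[1::2]):
            a_wins = rng.random() < win_probability(margins[a], margins[b])
            winner = a if a_wins else b
            if winner == max(a, b):  # larger seed number means the lower seed won
                upsets += 1
            winners.append(winner)
        alive = winners
        upsets_per_round.append(upsets)
    return upsets_per_round


def average_upsets(margins, n_sims=5000, seed=0):
    """Average number of upsets per round over many simulated regions."""
    rng = random.Random(seed)
    totals = [0, 0, 0, 0]
    for _ in range(n_sims):
        for i, count in enumerate(simulate_region(margins, rng)):
            totals[i] += count
    return [total / n_sims for total in totals]


if __name__ == "__main__":
    print(average_upsets(PLACEHOLDER_MARGIN))
```

A full-tournament version would run four such regions plus the Final Four and sum the counts. Swapping the placeholder margins for a given year's real values is what would make the per-round averages comparable across seasons.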

Speaking of the Final Four, I am also able to use the results of my Monte Carlo simulation to project the distribution of seeds that will advance to the final weekend. Most of the analysts that appear on television will frequently select three or even all four No. 1 seeds to make the Final Four. The reality is quite different.

Figure 4 compares the actual distribution of seeds in the Final Four from 2002 to 2022 with the distribution that I obtained in my simulation of the 2023 tournament.

Figure 4: Historical seed distribution in the Final Four (left panel) and the results of a simulation of the 2023 tournament (right panel)

History shows that a "typical" Final Four is actually made up of a No. 1 seed, a No. 1 or a No. 2 seed, a No. 2 or No. 3 seed, and one lower seed that averages to a No. 6 seed. More than two No. 1 seeds have made it to the Final Four only five times since seeding began in 1979.

Based on the data from this year's simulation, the Final Four is likely to be made up of more lower seeds than usual. The highest seed is still very likely to be a No. 1 seed, but there is over a 20% chance that none of the No. 1 seeds make it.

The second-best seed is likely a No. 1 or a No. 2 seed, but the odds that it is a No. 3, No. 4, or No. 5 seed are higher than usual. The third highest seed in 2023 looks to be a toss-up between a No. 2, a No. 3, and a No. 4 seed. As for the lowest seed in the 2023 Final Four, the odds suggest that there is an 85% chance that it is a No. 4 seed or lower.
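The "best seed, second-best seed, third, lowest" breakdown above amounts to sorting the four seeds in each simulated Final Four and tallying each position separately. Here is a minimal sketch of that bookkeeping. The function name and the four sample outcomes are hypothetical and exist only to show the mechanics.

```python
from collections import Counter


def seed_rank_distribution(final_fours):
    """For each rank (best seed, 2nd best, 3rd, lowest), map seed -> share of sims."""
    counters = [Counter() for _ in range(4)]
    for seeds in final_fours:
        # Sort so position 0 is the best (smallest) seed number in that Final Four.
        for rank, seed in enumerate(sorted(seeds)):
            counters[rank][seed] += 1
    n = len(final_fours)
    return [
        {seed: count / n for seed, count in sorted(counter.items())}
        for counter in counters
    ]


# Hypothetical simulation output: the seeds of the four regional champions.
sample = [(1, 4, 2, 6), (1, 1, 3, 7), (2, 4, 1, 5), (3, 2, 2, 8)]
dist = seed_rank_distribution(sample)

# Chance that no No. 1 seed reaches the Final Four in this toy sample.
no_one_seed = 1.0 - dist[0].get(1, 0.0)
print(dist[0], no_one_seed)
```

With thousands of simulated tournaments instead of four made-up ones, `dist[0]` would correspond to the "highest seed" distribution and `dist[3]` to the "lowest seed" distribution discussed above.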

Region-By-Region Overview

In part one of this series, I summarized the methodology that I use to break down and analyze the NCAA Tournament bracket. I use Kenpom efficiency data to estimate point spreads and I leverage those point spreads to simulate the tournament and to calculate the odds of various outcomes. 

The initial look at the bracket suggests that there will be slightly more upsets than usual in 2023 and the Final Four will very likely include several teams not on the No. 1 seed line. But which upsets will actually take place? Once again, the #math can help. Let's now look at each region in turn and then make a few informed predictions.

South Region

Table 1 below summarizes all of the relevant data for the South Region.

Table 1: 2023 NCAA Tournament South Region odds

This table gives a lot of information that we will use to make our picks. The left side of the table shows the pre-tournament Kenpom adjusted efficiency margin for each team. The shaded cells on the left side of the table provide a comparison of each team's efficiency relative to the historical average of teams of that seed. This provides a simple way to look at the relative strength or weakness of each team and the bracket as a whole.

The middle of the table shows the odds for each team to advance through each round of the tournament. The teams are sorted not by seed, but by the odds for each team to advance to the Final Four. The red or green shaded cells on the far right are the relative odds for each team to advance compared to historical averages for that seed.

Finally, I added a new column for this year labeled "SoD" which stands for "strength of draw."  This calculation starts with the odds for a historically average No. 1 seed to advance to the Final Four from any of the 16 positions on this year's bracket. I then compare those odds to the odds that the same historically average No. 1 seed would have to reach the Final Four in a historically average NCAA Tournament bracket.
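One way to picture the strength of draw idea is to drop the same historically average No. 1 seed into a given slot of two different brackets and compare its odds of winning the region. The sketch below does this with an exact round-by-round probability calculation. It rests on several assumptions that are not spelled out in the text: the comparison is taken as a simple difference (so a positive value means an easier draw), the margin-to-spread conversion uses an assumed tempo factor and standard deviation, and all function names are made up for illustration.

```python
from statistics import NormalDist

SIGMA = 10.2         # assumed std. dev. of the final margin (points)
TEMPO_FACTOR = 0.68  # assumed conversion from efficiency margin to points


def win_prob(margin_a, margin_b):
    """Chance that team A beats team B, given their efficiency margins."""
    return NormalDist(0.0, SIGMA).cdf((margin_a - margin_b) * TEMPO_FACTOR)


def region_title_odds(margins):
    """Exact odds for each bracket slot to win its region.

    `margins` lists efficiency margins in bracket-slot order, and its
    length must be a power of two (16 for a real region).
    """
    n = len(margins)
    reach = [1.0] * n  # probability that each slot is still alive
    block = 1          # size of the sub-bracket each slot has already won
    while block < n:
        nxt = [0.0] * n
        for i in range(n):
            start = (i // (2 * block)) * 2 * block
            # Possible opponents sit in the other half of this 2*block group.
            if i < start + block:
                opponents = range(start + block, start + 2 * block)
            else:
                opponents = range(start, start + block)
            nxt[i] = reach[i] * sum(
                reach[j] * win_prob(margins[i], margins[j]) for j in opponents
            )
        reach, block = nxt, block * 2
    return reach


def strength_of_draw(slot, this_year, average_year, avg_one_seed_margin):
    """Odds gained by an average No. 1 seed placed in `slot` of this year's bracket."""
    def odds(field):
        field = list(field)
        field[slot] = avg_one_seed_margin  # swap in the reference team
        return region_title_odds(field)[slot]
    return odds(this_year) - odds(average_year)
```

Because the reference team is identical in both brackets, any difference in its odds comes only from the quality of the teams around that slot, which is what makes the number comparable across seeds.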

Overall, Table 1 shows us that while the top three seeds (Alabama, Arizona, and Baylor) have the best odds to win the region, all three teams are relatively weak compared to a normal No. 1, No. 2, or No. 3 seed. Virginia is also an exceptionally weak No. 4 seed. 

As for possible South Region sleepers, No. 5 San Diego State, and especially No. 6 Creighton, No. 9 West Virginia, and No. 10 Utah State are relatively strong. This is a great hint for where some potential first and second round upsets might occur.

The strength of draw metric (SoD) for all 16 teams is positive, which suggests that the 2023 South Region is relatively weak compared to past tournaments. Once again San Diego State, Creighton, and Utah State grade out as having very good draws relative to typical teams on their seed lines.

But, overall Alabama still grades out with slightly above average odds to advance to the Final Four, which is the result that I feel is the most likely.

Midwest Region

Table 2 below summarizes all of the relevant data for the Midwest Region.

Table 2: 2023 NCAA Tournament Midwest Region odds.

Similar to the South Region, the Midwest Region also appears to be relatively weak. No. 1 Houston is barely an above average No. 1 seed, which is notable as the Cougars have the best overall odds to win the National Title at 17%. 

However, seeds No. 2 through No. 6 are all well below average. The only relatively above average teams among the single-digit seeds are No. 7 Texas A&M and No. 9 Auburn. As a result, it grades out as an easy draw for Houston, as well as for No. 12 Drake and No. 13 Kent State.

But, Houston is still essentially a mid-major team who now may be playing without star guard Marcus Sasser. Can they be trusted to win four games in a row and advance to the Final Four? I have my doubts.

In all honesty, No. 16 Northern Kentucky is one of the few other above average seeds in the region. That No. 1 versus No. 16 match-up looks like a traditional No. 2 versus No. 15 match-up. I certainly would not recommend making this pick in your office bracket... but it is notable.

West Region

Table 3 below summarizes all of the relevant data for the West Region.

Table 3: 2023 NCAA Tournament West Region odds.

As Table 3 shows, the West Region is quite a bit different from the South and the Midwest. The strength of draw data suggest that the West Region is, by far, the toughest region in this tournament. In contrast to the data we have seen so far, almost all of the top 13 seeds in the region are historically above average.

Furthermore, this is a very bad draw for No. 1 Kansas, which is a well below average No. 1 seed. The defending champions are actually the fourth highest ranked team in the region behind No. 2 UCLA, No. 4 UCONN, and No. 3 Gonzaga. I would not bank on a return to the Final Four for the Jayhawks.

Based on Table 3, No. 2 UCLA has the best odds to advance to the Final Four, but the Bruins are also dealing with the loss of a key player (Jaylen Clark) to a season-ending injury. Based on this, UCONN looks like a very promising sleeper Final Four team out of the West.

East Region

Table 4 below summarizes all of the relevant data for the East Region.

Table 4: 2023 NCAA Tournament East Region odds.

For as weak as the South and Midwest Regions appear, the East is actually the weakest region of the four, which is clear from the positive values in the strength of draw column as well as the negative (red) values in the relative Kenpom efficiency column. No. 4 Tennessee is the only historically above average seed in the top seven.

Overall, No. 1 Purdue still holds a slight lead in Final Four odds (24%), but No. 4 Tennessee (22%), and No. 2 Marquette (17%) are not far behind. That said, I have little confidence that any of these teams can win four games in a row in March. This region looks wide open.

As for Michigan State, the Spartans are also a historically below average No. 7 seed. But, they are less below average than most of the teams in the East Region, which in 2023 counts as a win. My calculations say that Michigan State has a 22% chance to reach the Sweet 16, a 4% chance to make the Final Four, and a 1-in-250 chance to win the National Title.

While these odds are not great, they are above average for a No. 7 seed. Furthermore, the strength of draw data suggests that the Spartans have the easiest path to the Final Four of any of the No. 7 seeds.

First Round Upset Analysis

Tables 1-4 provide a great snapshot of each region, but in any tournament it is the individual match-ups that ultimately matter. In part one of this series, my analysis led me to the conclusion that around 14 upsets are expected between the first and second rounds this year. But which upsets are the most likely? Figure 1 below helps to answer that question.

Figure 1: Odds for the higher seeded teams to win for each seed pairing, relative to the historical odds (shown in blue) for all first round games.

For my money, the data in Figure 1 is the most useful when filling out my bracket. The blue line in both panels shows the historical odds for the higher-seeded team to win each match-up. The labeled data points show the actual odds, based on Kenpom efficiency data (which accurately mirrors the Vegas spread).

If a data point falls below the line, an upset is more likely than average. If a data point is above the line, an upset is less likely than average. The farther a data point falls below the line, the more likely the upset is relative to the historical norm.
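The "below the line" reading of the figure can be expressed as a simple comparison: subtract the historical odds for the favorite's seed line from this year's odds for the favorite, and rank the games where the result is negative. The sketch below shows only that comparison. It is not the "upset predicting algorithm" itself, whose details are not given here, and every number and team label in it is a hypothetical placeholder rather than a value read from the figure.

```python
def upset_candidates(games, historical_favorite_odds):
    """Rank games by how far the favorite's odds fall below the historical line."""
    flagged = []
    for favorite_seed, label, favorite_odds in games:
        gap = favorite_odds - historical_favorite_odds[favorite_seed]
        if gap < 0:  # below the line: an upset is likelier than usual
            flagged.append((gap, label))
    # The most negative gap (farthest below the line) comes first.
    return [label for gap, label in sorted(flagged)]


# Hypothetical placeholder numbers, NOT values read from the figure.
historical = {5: 0.65, 7: 0.60}
games = [
    (5, "Team A (5) vs Team B (12)", 0.55),
    (7, "Team C (7) vs Team D (10)", 0.58),
    (7, "Team E (7) vs Team F (10)", 0.66),
]
ranked = upset_candidates(games, historical)
print(ranked)
```

Here the 5 versus 12 game ranks first because it sits farthest below its historical line, and the third game is dropped because its favorite is stronger than usual.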

In general, first round games involving No. 8 through No. 10 seeds are a coin flip. Figure 1 informs us that the coin in question is slightly weighted and that the most likely upsets here are:
  • No. 10 Utah State over No. 7 Missouri
  • No. 10 Boise State over No. 7 Northwestern
  • No. 9 Auburn over No. 8 Iowa
  • No. 9 West Virginia over No. 8 Maryland
My "upset predicting algorithm" also suggests taking: 
  • No. 9 Florida Atlantic over No. 8 Memphis. 
Michigan State does fall very slightly below the line in Figure 1, but the difference is not large enough to trigger an upset alert for the Spartans.

As for the No. 6 seeds, while there are usually one or two upsets per year on this seed line, my analysis does not clearly suggest any upsets here. The No. 11 seeds are generally weak. That said, the most likely potential upsets on the board would be for No. 11 Providence to upset No. 6 Kentucky or for the winner of the play-in game between No. 11 Arizona State and No. 11 Nevada to upset No. 6 TCU.

As for the iconic No. 12 over No. 5 upset, this year's bracket has one such match-up that is an obvious upset pick based on Figure 1:
  • No. 12 Drake over No. 5 Miami
If one is looking for a second upset pick here, No. 12 Oral Roberts over No. 5 Duke is very compelling.

The left panel of Figure 1 focuses on the potential upsets that are less likely. That said, there are usually one or two of these "big" upsets in any given tournament. My algorithm only recommends one upset in this panel:
  • No. 13 Kent State over No. 4 Indiana
As an added bonus, the result of my analysis would set up a second round match-up between two double-digit seeds (No. 12 Drake and No. 13 Kent State), which has happened a total of 12 times since 1991. No. 13 Furman over No. 4 Virginia is very tempting as another potential big first round upset.

Also note that my computer is not predicting a strong showing from the Big Ten. Four of my seven first round upsets involve Big Ten teams.

As for the No. 3 seeds and above, upsets of these teams seem to be unpredictable by nature. That said, my eye continues to be drawn to the No. 15 Vermont versus No. 2 Marquette match-up. Marquette is still a heavy favorite, but the spread is closer to what would be expected for a No. 3 versus a No. 14 seed.

Second Round Upsets

Moving onto the second round and beyond, Figure 2 shows a similar analysis which for consistency assumes that the favorites all win in the first round.

Figure 2: Odds for the higher seeded teams to win for each seed pairing, relative to the historical odds (shown in blue) for all second round games.

For second round action Figure 2 and my upset algorithm recommend several upsets. The No. 1 seeds and the No. 3 seeds appear to be particularly vulnerable in the round of 32. To this end, my second round upset picks are (in order of likelihood):
  • No. 5 San Diego State over No. 4 Virginia
  • No. 6 Creighton over No. 3 Baylor
  • No. 6 Kentucky over No. 3 Kansas State
  • No. 8 Arkansas over No. 1 Kansas
  • No. 6 Iowa State over No. 3 Xavier
  • No. 10 Utah State over No. 2 Arizona
  • No. 9 Florida Atlantic over No. 1 Purdue
If I count up the total number of upsets so far, I have selected a total of 14, which is the number that my simulation suggests is the most likely for the first and second rounds combined.

It is worth noting that my computer does not support the very trendy pick of No. 5 Duke to make the Sweet 16. The metrics strongly favor No. 4 Tennessee, although the Volunteers are also dealing with an injury to star point guard Zakai Zeigler, who is out for the season.

That said, the next upset on the list for the second round is for No. 7 Michigan State to upset No. 2 Marquette. I am likely going to add that one to my bracket, but I will leave that one to you and your conscience.

Sweet 16 and Beyond

If I make the assumptions above, I project that the tournament would play out as follows from the Sweet 16, using a combination of math and intuition.

In the South Region, I have No. 1 Alabama defeating No. 5 San Diego State and No. 6 Creighton taking out No. 10 Utah State. As stated above, the metrics favor No. 1 Alabama advancing to the Final Four.

In the Midwest Region, I have No. 1 Houston ending the Cinderella run of No. 12 Drake in the Sweet 16. In the other game I have No. 2 Texas defeating No. 6 Iowa State. In the regional final, I am going to go with the mild upset, placing No. 2 Texas in the Final Four.

In the wild West Region, I have No. 4 UCONN eliminating No. 8 Arkansas and then facing No. 3 Gonzaga, fresh off of the mild upset of No. 2 UCLA. The metrics then tell me to take No. 4 UCONN to the final weekend.

In the East Region, I have No. 4 Tennessee defeating No. 9 Florida Atlantic in the top half of the bracket. The other contest features No. 6 Kentucky versus No. 7 Michigan State. 

The metrics suggest that Kentucky would be about a one-point favorite in that contest. My gut is telling me that the Sweet 16 is likely the ceiling for this specific Michigan State team. As a result, I have No. 6 Kentucky advancing and then upsetting No. 4 Tennessee in the regional final.

This leaves a Final Four of No. 1 Alabama versus No. 6 Kentucky and No. 2 Texas versus No. 4 UCONN. The metrics here suggest wins by Alabama and UCONN, who I have in the Finals. In that game, I will take UCONN, because if nothing else it does not feel like a year when the No. 1 overall seed hoists the trophy.

As a final note, one compelling aspect of this analysis is that three of my Final Four teams finished the season ranked in the top six in Kenpom (which is a historical indicator of national champion contenders): Alabama (3), UCONN (4), and Texas (6). The remaining top six teams, Houston (1), UCLA (2), and Tennessee (5), all have significant injuries to key members of their roster.

This bracket also has a lower seed blue-blood in No. 6 Kentucky in the final weekend, which is just a trope that seems to happen often. That said, it is pretty easy in the minds of Michigan State fans to imagine that the lower seeded Final Four interloper could, in fact, be the Spartans. 

The analysis above suggests that Michigan State is only a slightly optimistic pick for the Sweet 16, and that is assuming that Kentucky and/or Marquette do not have a first round meltdown. Even if the Spartans were to face No. 3 Kansas State, the metrics suggest the Wildcats would only be favored by about one point.

From there, my analysis suggests that Michigan State could face a banged-up No. 4 Tennessee team with a head coach (Rick Barnes) who literally has the worst record of tournament under-performance in the modern era.

If you are looking for a reason to be optimistic, I leave you with that.

Enjoy the madness.
