Dr. Green and White helps you fill out your bracket

For college basketball junkies, this is the best week of the year. Selection Sunday through the end of the first weekend of the NCAA basketball tournament is the greatest eight-day span on the sports calendar, and 2022 marks the first time since 2019 that the NCAA Tournament will be "back to normal."

But even casual fans of college basketball go through the annual ritual of filling out a bracket in an attempt to predict which crazy upsets will occur during the first two days, which teams will make the Final Four, who will finally cut down the nets on that first Monday of April, and everything in between.

While everyone has their own methods and strategies for picking which teams will advance, I have developed my own system over the years that uses a combination of math and historical probabilities. My method is certainly not foolproof, but it does provide some useful tips that have led to some office pool success over the years. 

It helped me to virtually nail the Final Four in 2019, and it correctly predicted that No. 3 Texas would not survive its first-round test last year against No. 14 Abilene Christian. This year, I have crunched the numbers once again and I am happy to share the results with the class.

Methodology Overview

Last year, I presented a more detailed overview of my methodology. Briefly, I made a simple observation several years ago which forms the foundation for the analysis that I am about to present. That observation is:

When it comes to NCAA Tournament upsets, the behavior is exactly the same as in regular season games. The odds are largely predictable based on Vegas point spreads and on tools that can predict point spreads, such as Kenpom efficiency margin data.

All of my analysis of college basketball odds is based on this same premise. Kenpom efficiency data can be used to assign probabilities to any arbitrary basketball match-up. Knowing this, the full season and any tournament can be mathematically modeled and its odds can be calculated.
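As a rough illustration of that premise, the sketch below turns a difference in efficiency margin into a projected point spread and then into a win probability. It is only a minimal sketch under stated assumptions: the tempo of 68 possessions, the 11-point standard deviation of game margins, and the function names are all illustrative choices, not values or code taken from this analysis.

```python
from statistics import NormalDist

# Assumed constants (not from the article): a typical number of possessions
# per game and the spread of final margins around the projected line.
AVG_TEMPO = 68.0
MARGIN_SD = 11.0


def projected_spread(em_a: float, em_b: float, tempo: float = AVG_TEMPO) -> float:
    """Projected neutral-court margin for team A over team B.

    Efficiency margins are quoted in points per 100 possessions, so the
    difference is scaled by the expected number of possessions.
    """
    return (em_a - em_b) * tempo / 100.0


def win_probability(spread: float, sd: float = MARGIN_SD) -> float:
    """Chance that a team favored by `spread` points wins the game.

    Models the final margin as normally distributed around the spread.
    """
    return NormalDist(mu=0.0, sigma=sd).cdf(spread)


if __name__ == "__main__":
    # Hypothetical efficiency margins for two teams.
    spread = projected_spread(20.0, 10.0)
    print(f"spread: {spread:.1f}, win probability: {win_probability(spread):.2f}")
```

A pick'em (spread of zero) maps to a 50 percent win probability, and the curve flattens for large spreads, which is why heavy favorites rarely lose.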

My favorite plot to highlight this fact is shown below.

Figure 1: Correlation between NCAA Tournament upsets and the odds predicted using Kenpom efficiency data.

The figure summarizes the upset frequencies of some of the most common seed pairings that occur in the NCAA Tournament. As we can see, the actual frequencies correlate extremely well with what we would expect based on the win probabilities derived from either the actual Vegas point spreads or Kenpom efficiency data.

Just in case there is still some doubt about the value of using Kenpom data to project point spreads, Figure 2 shows the current correlation for all of the first-round games, based on Tuesday's lines as published on DraftKings. Note that the left-hand panel is the full set of data, while the right-hand panel is an enlarged view of the data for the games where the spread is 10 points or less.

Figure 2: Correlation between the Vegas lines and the point differentials predicted using Kenpom efficiency margins. The right panel shows an enlarged view of the games where the spread is 10 points or less.

As we can see, the correlation is very strong, with only a handful of games differing by more than a point or two.

Upset Picks

A careful analysis of Figure 2 will already start to give some hints as to where some of the more likely upsets will occur. Are there any pairings above that look to have a tighter spread than one might expect for that seed pairing? Naturally, those are the games to put on upset alert.

A better way to visualize these upset odds is to plot them as a group relative to each other and to the historical odds of an upset for that particular pairing. Figure 3 shows this analysis for the full set of first-round games.

Figure 3: Odds for the higher seeded teams to win for each seed pairing, relative to the historical odds (shown in blue) for all first round games

Using this figure, it is easy to see where the most likely upsets will occur. If a game falls below the blue line (the historical odds that the higher seeded team wins) an upset is more probable. If a game is above the blue line, it is less probable. That said, the odds shown at the right are still the "true" odds for the upset.
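The comparison against the blue line can be sketched in a few lines of code. This is a hypothetical illustration: the team names, win probabilities, and historical baselines below are placeholders rather than the data behind Figure 3, and `upset_alerts` is a made-up helper name.

```python
# Placeholder historical win rates for the higher seed, keyed by seed pairing.
# These are illustrative values, not the article's data.
historical_favorite_win = {(5, 12): 0.65, (6, 11): 0.60}

# Placeholder first-round games with the higher seed's projected win probability.
games = [
    {"favorite": "Team A", "seeds": (5, 12), "win_prob": 0.58},
    {"favorite": "Team B", "seeds": (5, 12), "win_prob": 0.72},
    {"favorite": "Team C", "seeds": (6, 11), "win_prob": 0.48},
]


def upset_alerts(games, baseline):
    """Return (favorite, gap) pairs for games that fall below the baseline.

    `gap` is how far the favorite's win probability sits under the
    historical rate for its seed pairing; larger gaps come first.
    """
    flagged = []
    for game in games:
        gap = baseline[game["seeds"]] - game["win_prob"]
        if gap > 0:
            flagged.append((game["favorite"], round(gap, 3)))
    return sorted(flagged, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, gap in upset_alerts(games, historical_favorite_win):
        print(f"{name} is {gap:.0%} below the historical line")
```

Note that a flagged game is not necessarily a likely upset in absolute terms; it is simply more likely than usual for that seed pairing.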

The left side of Figure 3 shows the data for the top four seeds in each region. As a general rule, there are only one or two "major" upsets of this nature in any given tournament. There were a total of four of these upsets in 2021, but that was the most in Tournament history. That said, there have been only five tournaments since 1985 where all of the top four seeds have survived the first round (1994, 2000, 2004, 2007, and 2017). Therefore, it is quite likely that at least one of those teams will fall.

Upsets of No. 1 or No. 2 seeds are quite rare and unpredictable. Based on Figure 3, I would not expect one in 2022. But there are a total of four possible upsets on the No. 3 and No. 4 lines that stick out from Figure 3. No. 3 Wisconsin, No. 4 Illinois, No. 4 Arkansas, and especially No. 4 Providence are all more likely to be upset than a typical team of their seed, based on historical trends.

If we dig into the numbers, the reasons are clear. In the 2022 tournament, the No. 4 seeded teams appear to be weaker than normal, historically, while the No. 13 seeds are stronger. This is a classic recipe for an upset.

As for Wisconsin, it is not that Colgate is a particularly strong No. 14 seed. In fact, they are a below average No. 14 seed. The problem is that Wisconsin is a historically poor No. 3 seed, based on their current Kenpom efficiency margin. In my analysis of the Big Ten season, Wisconsin consistently graded out as the luckiest team in the Big Ten. Do not be surprised if they do not stay for long in the Big Dance.

The right side of Figure 3 shows the data for the first-round games involving teams seeded No. 5 to No. 12. This is where the bulk of the upsets (relative to seeding) occur in any given tournament. Once again, the Figure gives insight into which upsets are more probable.

Interestingly, while No. 4 seeds seem to be in quite a bit of danger in the first round, the No. 5 seeds look fairly safe. Only St. Mary's College -- who will play the winner of the Indiana / Wyoming game in the play-in round -- looks somewhat vulnerable. Considering that the winner of the play-in game tends to have success in the first round and the fact that at least one No. 5 seed has lost in the first round of 31 of the past 36 tournaments, this might be a good bet.
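To see why a No. 5 seed falls in most years, it helps to work out the chance that at least one of four favorites loses. The sketch below assumes the four games are independent and uses a round 65 percent win rate for each No. 5 seed; both are simplifying assumptions for illustration, not figures from this analysis.

```python
from math import prod


def prob_at_least_one_upset(favorite_win_probs):
    """Chance that at least one favorite loses, assuming independent games.

    The only way to avoid an upset is for every favorite to win, so the
    answer is one minus the product of the individual win probabilities.
    """
    return 1.0 - prod(favorite_win_probs)


if __name__ == "__main__":
    # Assumed 65 percent win rate for each of the four No. 5 seeds.
    typical_year = prob_at_least_one_upset([0.65] * 4)
    observed = 31 / 36  # frequency quoted in the text
    print(f"model: {typical_year:.2f}, observed: {observed:.2f}")
```

Under those assumptions the model gives roughly 82 percent, which is in the same neighborhood as the 31-of-36 historical frequency. Even when every individual No. 5 seed is a solid favorite, picking all four to advance is the less likely outcome.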

As the blue line in Figure 3 shows, teams seeded No. 6 or No. 7 tend to get upset 40 to 45 percent of the time. These games are typically close to toss-ups. In 2022, five of those eight contests look riper than usual for an upset. Warning: the conclusions here are not great for Michigan State fans.

Based on this analysis, No. 7 Michigan State is officially on upset alert versus No. 10 Davidson. Furthermore, No. 11 Michigan also stands a very good shot to "upset" No. 6 Colorado State. As Figure 2 shows, the Wolverines are actually favored in this game. My metrics suggest an upset pick is in order in both cases. You, dear reader, will simply need to pick with your conscience.

As for the other potential upsets, Kenpom has No. 10 Loyola-Chicago favored over No. 7 Ohio State and No. 10 San Francisco favored over No. 7 Murray State. Those are both good bets. No. 6 Texas also looks vulnerable against No. 11 Virginia Tech.

For the No. 8 and No. 9 seed games, these are historically true toss-up games, as Figure 3 suggests. In this case, it is best to consult the Vegas line, which currently has No. 9 Memphis favored over Boise State and which has No. 9 TCU as a pick'em versus No. 8 Seton Hall. Those are the two most likely "upsets" in that group of four games.

Vegas spreads are a useful tool for the first-round games. However, they are not available for any games in subsequent rounds. Fortunately, Kenpom data can be used to project these lines and win probabilities, which still allows for further analysis, as shown below in Figure 4.

Figure 4: Odds for the higher seeded teams to win for each seed pairing, relative to the historical odds (shown in blue) for rounds two to four, based on Kenpom efficiency margin data.

The left side of Figure 4 compares the odds for the higher seeds to win in the second round of the tournament.

No. 1 seeds get bounced in the second round roughly every-other year, on average. Illinois experienced that in 2021, but all the No. 1 seeds advanced in 2019. Based on Figure 4, the most vulnerable No. 1 seed is Kansas, but there is still a 70 percent chance that the Jayhawks survive until the second weekend.

In general, about two-thirds of all No. 2 seeds advance to the Sweet 16 and it is quite rare that all four survive the first weekend in any given tournament. That said, there is no clear No. 2 seed that appears vulnerable in 2022. I can think of one No. 2 seed that I would like to see lose to a certain Green and White-clad No. 7 seed (if they make it that far) but I will save that analysis for later in the week.

As for the No. 3 and No. 4 seeds, history tells us that about half of them will likely advance to the Sweet 16. Figure 4 gives some strong hints as to which of these seeds are more likely to be upset, and the news is not great for some Big Ten fans.

Interestingly, there are four potential second-round games involving No. 3 or No. 4 seeds where the lower-seeded team is projected to be favored. No. 6 LSU would be favored over No. 3 Wisconsin, No. 5 Houston is projected to be favored over No. 4 Illinois, and No. 5 UCONN is a likely pick'em versus No. 4 Arkansas. Furthermore, No. 5 Iowa is projected to be a big favorite over No. 4 Providence.

Naturally, this assumes that all of the No. 4 seeds win their first round match-ups, which Figure 3 suggests is not likely. Also note that No. 3 seed Purdue may be slightly vulnerable to No. 6 Texas, if the Longhorns can beat Virginia Tech in the first round.

Finally, the right side of Figure 4 presents the same data for the potential match-ups in the Sweet 16 and Elite Eight. This analysis does assume that the top seeds all advance, which is unlikely, but it does provide some hints as to which teams are more or less likely to advance to the Final Four.

For example, No. 1 Baylor looks potentially vulnerable to No. 4 UCLA in the Sweet 16, and even more vulnerable in a potential Regional Final showdown with No. 2 Kentucky. Figure 4 also suggests that No. 3 Tennessee and No. 3 Texas Tech both might be favored over No. 2 Villanova and No. 2 Duke in their respective regions. No. 1 Kansas would also be only a slight favorite over No. 2 Auburn in a potential Midwest Regional Final.

The analysis above will hopefully provide a good start in filling out a bracket. But, exactly how many upsets should we expect? How is each individual Region likely to shake out? Stay tuned for Part Two, coming soon.

The time has almost come for the Madness to begin once again. On Thursday at 12:15 p.m., the first round of the NCAA Tournament will tip off. But, before you fill out your bracket, I have some math-based tips to help you along. 

On Tuesday, I broke down my method to identify how to spot the most likely upsets on the bracket. This analysis will go a long way toward making some informed office pool decisions. Today, in part two of this series, I wanted to approach the analysis from the point of view of each region and the tournament as a whole. In other words, yesterday I identified the pieces, and today I will tell you how to assemble them.

Before we dive into each region, there are two other figures that will add context to the analysis. The first one shows the average number of upsets to expect in each round and overall.

Figure 1: Number of projected upsets per round of the 2022 NCAA Tournament based on a Monte Carlo Simulation and compared to the historical value and the average of the series of historical simulations.

Figure 1 shows three sets of data (two simulations and one actual measurement). The blue bars show the number of upsets projected in my simulation of the 2022 Tournament. The red bars show the average number of upsets per round in the set of simulations that I have performed on the last 19 tournaments. The green bars show the actual average number of upsets going back to 1985, when the tournament expanded to 64 teams.

As we can see, all three sets of data show the same trend. The fact that the simulated results match the actual results so well is another indicator that my methodology is robust. That said, Figure 1 also gives the standard deviations, which suggest that there is still a lot of potential variance in these numbers.

While there most likely will be between eight and nine first-round upsets, there is only a two-thirds chance that there will be between six and 11 upsets. There is also only a five percent chance of either more than 13 upsets or fewer than four.
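The mean and spread of the first-round upset count can be estimated with a small Monte Carlo simulation. The sketch below is a simplified stand-in for the full tournament simulation described here: it covers only the first round, treats every game as independent, and uses placeholder win probabilities for the higher seed in each pairing (1 vs. 16 through 8 vs. 9) rather than the actual 2022 values.

```python
import random
from statistics import mean, pstdev

# Placeholder win probabilities for the higher seed in each first-round
# pairing (1v16, 2v15, ..., 8v9), repeated for four regions. Illustrative only.
PAIRING_PROBS = [0.99, 0.94, 0.85, 0.79, 0.65, 0.60, 0.58, 0.50]
FIRST_ROUND = PAIRING_PROBS * 4


def simulate_upset_counts(favorite_win_probs, n_trials=10_000, seed=0):
    """Simulate the round many times and return the upset count per trial."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    counts = []
    for _ in range(n_trials):
        # An upset occurs whenever the higher seed fails to win.
        upsets = sum(1 for p in favorite_win_probs if rng.random() > p)
        counts.append(upsets)
    return counts


if __name__ == "__main__":
    counts = simulate_upset_counts(FIRST_ROUND)
    avg, sd = mean(counts), pstdev(counts)
    within_one_sd = sum(1 for c in counts if abs(c - avg) <= sd) / len(counts)
    print(f"mean upsets: {avg:.1f}, standard deviation: {sd:.1f}")
    print(f"share of trials within one standard deviation: {within_one_sd:.0%}")
```

The share of trials that land within one standard deviation of the mean is what the "two-thirds chance" statement above refers to, and the tails of the same distribution give the five percent figure.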

The second piece of data has to do with the make-up of the Final Four. In the typical Selection Sunday prediction shows, it is common for the analysts to make a very "chalk-like" prediction of up to three No. 1 seeds in the final weekend. But, in reality, this rarely happens. Figure 2 gives the actual average make-up of the Final Four, based on historical data.

Figure 2: Historical make-up of the Final Four, based on seeds.

The data here is best summarized by looking at the distribution of the highest seed, the second highest seed, the third highest seed, and the lowest seed. As we can see, there is almost always at least one No. 1 seed, but the odds of two No. 1 seeds in the Final Four are just above 50-50. The third highest seed is most often a No. 2 seed, but the lowest seed is almost always a No. 3 seed or lower.

The 2021 Tournament was a perfect example, as No. 1 Gonzaga, No. 1 Baylor, No. 2 Houston, and No. 11 UCLA all reached the final weekend.

Region by Region Analysis

Let's now move on to look at each Region in more detail, starting in the West. In each case, I will present a table of data that summarizes each team's odds to advance through each round of the Tournament, based on the projected point spreads for any possible tournament match-up.

In addition, each table contains a block of data that compares each team's Kenpom efficiency and round-by-round odds to that of a historically average team with the same seed. This gives a clear indication of the relative strength or weakness of each team in each region.
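Round-by-round odds of this kind can be computed exactly, without simulation, once a win probability is available for every possible match-up. The sketch below shows one way to do that for a single-elimination bracket. The ratings (expressed in points), the 11-point standard deviation, and the function names are assumptions made for illustration; this is not the code used to build the tables.

```python
from statistics import NormalDist


def win_prob(rating_a, rating_b, sd=11.0):
    """Chance that team A beats team B, from a projected point spread.

    The spread is taken as the rating difference; `sd` is an assumed value.
    """
    return NormalDist(mu=0.0, sigma=sd).cdf(rating_a - rating_b)


def advancement_odds(ratings):
    """Return odds[r][i]: the chance that team i wins at least r + 1 games.

    `ratings` must be listed in bracket order (adjacent teams meet first)
    and its length must be a power of two.
    """
    n = len(ratings)
    alive = [1.0] * n  # chance that each team reaches the current round
    rounds = []
    block = 1  # size of the sub-bracket each team has already won
    while block < n:
        advanced = [0.0] * n
        for i in range(n):
            # Possible opponents come from the other half of the
            # 2 * block sized group that contains team i.
            start = (i // (2 * block)) * 2 * block
            if i < start + block:
                opponents = range(start + block, start + 2 * block)
            else:
                opponents = range(start, start + block)
            # Weight each possible opponent by its chance of getting there.
            advanced[i] = alive[i] * sum(
                alive[j] * win_prob(ratings[i], ratings[j]) for j in opponents
            )
        rounds.append(advanced)
        alive = advanced
        block *= 2
    return rounds


if __name__ == "__main__":
    # Hypothetical four-team bracket: team 0 plays team 1, team 2 plays team 3.
    for r, odds in enumerate(advancement_odds([10.0, 0.0, 5.0, 5.0]), start=1):
        print(f"round {r}:", [round(p, 3) for p in odds])
```

Because each team's potential opponents come from a separate part of the bracket, the odds in every round sum to the number of teams that advance, which is a handy sanity check on the output.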

West Region

Table 1: 2022 NCAA Tournament West Region Odds

With overall top seed and tournament favorite Gonzaga sitting on the top of the bracket, the West is the hardest of the four regions. The Zags grade out with over a 50 percent chance to advance to the Final Four.

At a glance, Table 1 reveals that only No. 1 Gonzaga, No. 3 Texas Tech, No. 9 Memphis, No. 13 Vermont, and No. 16 Georgia State are stronger seeds than the historical averages. As a result, No. 3 seed Texas Tech checks in with the second-best odds to win the region. It is also notable that No. 5 UCONN has better odds to reach the Final Four than No. 4 Arkansas.

As for No. 7 Michigan State, the Spartans' first game with No. 10 Davidson is a virtual toss-up and Michigan State's odds to advance to the Sweet 16 (by most likely ending Coach K's career at No. 2 Duke) are about one-in-six. The Spartans have about a five percent chance to make the Regional Final and only a one percent chance to reach Coach Izzo's ninth Final Four. Michigan State's National Title odds are slightly better than 1-in-1,500.

In the first two rounds of the West Region, my metrics suggest the following upsets:
  • No. 10 Davidson over No. 7 Michigan State (don't @ me. It's just #math)
  • No. 9 Memphis over No. 8 Boise State.
  • No. 5 UCONN over No. 4 Arkansas 
This would set up the following Sweet 16 match-ups:
  • No. 1 Gonzaga versus No. 5 UCONN
  • No. 2 Duke versus No. 3 Texas Tech
Based on the Table above, Texas Tech would likely be favored to beat Duke and advance to the Regional Final, where they would meet (and I project lose to) No. 1 Gonzaga.

As a final note on the West, if you are looking for a potential Cinderella to make a surprising run to the Sweet 16, No. 13 Vermont looks like the best bet.

South Region

The odds table for the South Region is shown below in Table 2.

Table 2: 2022 NCAA Tournament South Region Odds.

Overall, the South Region is in a virtual dead heat with the East Region as the second most challenging bracket. No. 1 Arizona is the favorite to win the region with odds of just below 30 percent, but No. 3 Tennessee, No. 5 Houston, and No. 2 Villanova all have odds between 16 and 21 percent. Note that No. 4 Illinois' Final Four odds are remarkably small at just four percent.

As for first- and second-round upsets, the data suggests the following:
  • No. 10 Loyola over No. 7 Ohio State
  • No. 9 TCU over No. 8 Seton Hall
  • No. 11 Michigan over No. 6 Colorado State 
  • No. 5 Houston over No. 4 Illinois
  • No. 10 Loyola over No. 2 Villanova
Note that this analysis does not consider some of the injury information coming out of Ann Arbor, which could certainly impact that game considerably. Either way, I project that No. 11 Michigan would lose in the second round to No. 3 Tennessee even if they can survive the first round.

My analysis yesterday did not make any clear recommendations for a second-round exit from a No. 2 seed, but the potential match-up between Loyola and Villanova has the best odds and is therefore my pick. This also implies that the Ramblers are my double-digit Cinderella pick for the South Region.

This would set up the following Sweet 16 match-ups:
  • No. 1 Arizona versus No. 5 Houston
  • No. 3 Tennessee versus No. 10 Loyola
My analysis of the South Region Sweet 16 has Tennessee advancing to the Region Final, but it also loves Houston's chances against Arizona. Houston finished the season ranked No. 4 in Kenpom and as a result, the method that I use has the Cougars beating both Arizona and Tennessee to make their second consecutive Final Four.

That said, Houston is also dealing with some injury issues and I am not sure that my analysis can be completely trusted. Thus, I am calling an audible on my own analysis. I will take No. 1 Arizona to beat No. 5 Houston, but I will also pick No. 3 Tennessee to win the region. It is historically unlikely to see the top two overall seeds advance to the Final Four and this pick just feels right.

Midwest Region

The odds table for the Midwest Region is shown below in Table 3.

Table 3: 2022 NCAA Tournament Midwest Region Odds. 

The Midwest Region grades out as the easiest of the four regions and at a glance, Table 3 gives a hint as to why. The top four seeds in the region are all historically below average. As a result, No. 1 Kansas still has the best odds to advance to the Final Four at 29 percent. No. 2 Auburn is right behind the Jayhawks at 28 percent. Notably, No. 5 Iowa has the third best odds at 19 percent.

As for first- and second-round upsets, the data suggests the following:
  • No. 13 South Dakota State over No. 4 Providence
  • No. 6 LSU over No. 3 Wisconsin
That said, if a No. 1 seed is to fall in the second round, No. 8 San Diego State upsetting No. 1 Kansas seems to be the most likely. I am not making that prediction, but it is tempting. 

I then have the Midwest Sweet 16 as follows:
  • No. 1 Kansas versus No. 5 Iowa
  • No. 2 Auburn versus No. 6 LSU
If I follow my methodology to the letter, the numbers tell me that No. 5 Iowa is going to upset both No. 1 Kansas and No. 2 Auburn to reach their first Final Four since 1980. That said, I just can't bring myself to believe that the Hawkeyes and their 78th-ranked defense are strong enough to win four Tournament games in a row.

Therefore, my official pick to win the Midwest just reverts to the team with the best odds, which is No. 1 Kansas. I don't feel great about that, but it is what it is. 

Also note that the best bet for a double-digit Cinderella in the Midwest is No. 13 South Dakota State.

East Region

Finally, the odds table for the East Region is shown below in Table 4.

Table 4: 2022 NCAA Tournament East Region Odds.

In this case, the East Region appears to have a somewhat weak No. 1 seed in Baylor, but a strong No. 2 seed in Kentucky as well as a strong No. 4 seed in UCLA. These three teams all have between an 18 percent and a 27 percent chance to win the region.

My first- and second-round upset picks in the East are as follows:
  • No. 10 San Francisco over No. 7 Murray State 
  • No. 11 Virginia Tech over No. 6 Texas
No. 11 Virginia Tech would also be a potential upset pick over No. 3 Purdue in the round of 32. For this reason, the Hokies are the most likely double-digit Cinderella team in the East Region. I am also tempted to take Indiana over Saint Mary's as my only No. 5 seed to lose in the first round, but I am going to go out on a limb and project that all four No. 5 seeds will advance.

With all chalk in the second round, we are left with the following Sweet 16 match-ups:
  • No. 1 Baylor versus No. 4 UCLA
  • No. 2 Kentucky versus No. 3 Purdue
In this case, my methodology clearly projects that UCLA will upset Baylor, but that the Bruins will fail to advance to their second consecutive Final Four when they face No. 2 Kentucky, who I am picking to win the East.

Final Analysis

Based on this analysis, I am projecting a Final Four consisting of the following teams:

  • No. 1 Gonzaga versus No. 2 Kentucky
  • No. 1 Kansas versus No. 3 Tennessee
The metrics strongly favor Gonzaga, as the Tables above suggest. The Zags do have the best odds to win the National Title. That said, I am going to go with a final of No. 2 Kentucky over No. 1 Kansas. Your mileage may vary.

As a final note of justification of this projection, I will comment that 17 of the past 20 National Champions have entered the NCAA Tournament ranked in the top six of Kenpom's adjusted efficiency margin. The current top six are, in order:
  1. Gonzaga (West No. 1 seed)
  2. Arizona (South No. 1 seed)
  3. Kentucky (East No. 2 seed)
  4. Houston (South No. 5 seed)
  5. Baylor (East No. 1 seed)
  6. Kansas (Midwest No. 1 seed)
Two of those teams are from mid-major conferences (Gonzaga and Houston) where the level of competition casts some doubt as to whether the metrics can be used to fairly evaluate the teams. Last year's Final Four, where Baylor blew out both teams, still sticks in my mind in this regard.

Furthermore, Baylor has some injury issues, while Arizona comes from a weaker league and has a first-year head coach. These facts alone are enough to give me pause about picking either team to win it all. That simply leaves us with Kentucky and Kansas, my picks for the National Title game.

That is all for this year's analysis. Embrace the madness and Go Green.
