For college basketball junkies, this is the best week of the year. Selection Sunday through the end of the first weekend of the NCAA basketball tournament is the greatest eight-day span on the sports calendar, and 2022 marks the first time since 2019 that the NCAA Tournament will be "back to normal."
But even casual fans of college basketball go through the annual ritual of filling out a bracket in an attempt to predict which crazy upsets will occur during the first two days, which teams will make the Final Four, who will finally cut down the nets on that first Monday of April, and everything in between.
While everyone has their own methods and strategies for picking which teams will advance, I have developed my own system over the years that uses a combination of math and historical probabilities. My method is certainly not foolproof, but it does provide some useful tips that have led to some office pool success over the years.
It helped me virtually nail the Final Four in 2019, and it correctly predicted that No. 3 Texas would not survive its first-round test last year against No. 14 Abilene Christian. This year, I have crunched the numbers once again and I am happy to share the results with the class.
Methodology Overview
Last year, I presented a more detailed overview of my methodology. Briefly, I made a simple observation several years ago that forms the foundation for the analysis that I am about to present. That observation is:
When it comes to NCAA Tournament upsets, the behavior is exactly the same as in regular season games. The odds are largely predictable based on Vegas point spreads and by tools that can predict point spreads, such as Kenpom efficiency margin data.
All of my analysis of college basketball odds is based on this same premise. Kenpom efficiency data can be used to assign probabilities to any arbitrary basketball match-up. Knowing this, the full season and any tournament can be mathematically modeled and its odds can be calculated.
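For readers who want to see the idea in code, here is a minimal sketch of how an efficiency margin gap might be turned into a projected spread and a win probability. It assumes the spread scales with the difference in efficiency margins (points per 100 possessions) times the expected number of possessions, and that the final margin is roughly normally distributed around that spread. The function names, the 67-possession tempo, and the 11-point standard deviation are illustrative assumptions, not necessarily the values used in the analysis here.

```python
from statistics import NormalDist


def projected_spread(em_a: float, em_b: float, tempo: float = 67.0) -> float:
    """Projected margin (team A minus team B) on a neutral court.

    em_a, em_b: efficiency margins in points per 100 possessions.
    tempo: expected possessions in the game (assumed value).
    """
    return (em_a - em_b) * tempo / 100.0


def win_probability(spread: float, sigma: float = 11.0) -> float:
    """Chance that team A wins if the final margin is normal(spread, sigma).

    sigma is an assumed standard deviation of the final margin, in points.
    """
    return NormalDist().cdf(spread / sigma)


# Hypothetical efficiency margins, for illustration only.
spread = projected_spread(20.0, 10.0, tempo=70.0)
prob = win_probability(spread)
```

A spread of zero maps to a 50 percent win probability, and the probability climbs toward 100 percent as the projected spread grows.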
My favorite plot to highlight this fact is shown below.
Figure 1: Correlation between NCAA Tournament upsets and the odds predicted using Kenpom efficiency data.
The figure summarizes the upset frequencies of some of the most common seed pairings that occur in the NCAA Tournament. As we can see, the actual frequencies correlate extremely well with what we would expect based on the win probabilities derived from either the actual Vegas point spreads or Kenpom efficiency data.
Just in case there is still some doubt about the value of using Kenpom data to project point spreads, Figure 2 shows the current correlation for all of the first-round games, based on Tuesday's lines as published on DraftKings. Note that the left-hand panel is the full set of data, while the right-hand panel is an enlarged view of the data for the games where the spread is 10 points or less.
Figure 2: Correlation between the Vegas lines and the point differentials predicted using Kenpom efficiency margins. The right panel shows an enlarged view of the games where the spread is 10 points or less.
As we can see, the correlation is very strong, with only a handful of games differing by more than a point or two.
Upset Picks
A careful analysis of Figure 2 will already start to give some hints as to where some of the more likely upsets will occur. Are there any pairings above that look to have a tighter spread than one might expect for that seed pairing? Naturally, those are the games to put on upset alert.
A better way to visualize these upset odds is to plot them as a group relative to each other and to the historical odds of an upset for that particular pairing. Figure 3 shows this analysis for the full set of first-round games.
Figure 3: Odds for the higher seeded teams to win for each seed pairing, relative to the historical odds (shown in blue) for all first round games.
Using this figure, it is easy to see where the most likely upsets will occur. If a game falls below the blue line (the historical odds that the higher seeded team wins) an upset is more probable. If a game is above the blue line, it is less probable. That said, the odds shown at the right are still the "true" odds for the upset.
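The comparison against the blue line can be sketched in a few lines of Python. The snippet flags every game where the higher seed's projected win probability falls below the historical rate for that seed, and sorts the flagged games by the size of the gap. The `Game` type, the `upset_alerts` function, the team names, and every number below are hypothetical placeholders for illustration, not the actual 2022 data or the true historical rates.

```python
from typing import NamedTuple


class Game(NamedTuple):
    seed: int         # seed of the higher-seeded team
    favorite: str     # higher-seeded team
    underdog: str     # lower-seeded team
    win_prob: float   # projected chance that the higher seed wins


def upset_alerts(games, historical):
    """Return games whose higher seed is weaker than the historical norm.

    historical maps a seed to the historical win rate of that seed in
    this round. Games are returned with the largest gap first.
    """
    flagged = []
    for game in games:
        gap = historical[game.seed] - game.win_prob
        if gap > 0:
            flagged.append((game, gap))
    flagged.sort(key=lambda pair: pair[1], reverse=True)
    return [game for game, _ in flagged]


# Placeholder rates and games, NOT real data.
HISTORICAL = {4: 0.79, 5: 0.65}
games = [
    Game(4, "Team A", "Team M", 0.70),
    Game(4, "Team B", "Team N", 0.85),
    Game(5, "Team C", "Team O", 0.60),
]
alerts = upset_alerts(games, HISTORICAL)
```

Note that a flagged game is only more upset-prone than usual for its seed line. The higher seed can still be the clear favorite in absolute terms.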
The left side of Figure 3 shows the data for the top four seeds in each region. As a general rule, there are only one or two "major" upsets of this nature in any given tournament. There were a total of four of these upsets in 2021, but that was the most in Tournament history. That said, there have been only five tournaments since 1985 where all of the top four seeds have survived the first round (1994, 2000, 2004, 2007, and 2017). Therefore, it is quite likely that at least one of those teams will fall.
Upsets of No. 1 or No. 2 seeds are quite rare and unpredictable. Based on Figure 3, I would not expect one in 2022. But there are a total of four possible upsets on the No. 3 and No. 4 lines that stick out from Figure 3. No. 3 Wisconsin, No. 4 Illinois, No. 4 Arkansas, and especially No. 4 Providence are all more likely to be upset than a typical team of their seed, based on historical trends.
If we dig into the numbers, the reasons are clear. In the 2022 tournament, the No. 4 seeded teams appear to be weaker than normal, historically, while the No. 13 seeds are stronger. This is a classic recipe for an upset.
As for Wisconsin, it is not that Colgate is a particularly strong No. 14 seed. In fact, they are a below-average No. 14 seed. The problem is that Wisconsin is a historically poor No. 3 seed, based on their current Kenpom efficiency margin. In my analysis of the Big Ten season, Wisconsin consistently graded out as the luckiest team in the Big Ten. Do not be surprised if they do not stay long in the Big Dance.
The right side of Figure 3 shows the data for the first-round games involving teams seeded No. 5 to No. 12. This is where the bulk of the upsets (relative to seeding) occur in any given tournament. Once again, the Figure gives insight into which upsets are more probable.
Interestingly, while No. 4 seeds seem to be in quite a bit of danger in the first round, the No. 5 seeds look fairly safe. Only St. Mary's College, which will play the winner of the Indiana/Wyoming play-in game, looks somewhat vulnerable. Considering that the winner of the play-in game tends to have success in the first round, and that at least one No. 5 seed has lost in the first round in 31 of the past 36 tournaments, picking that upset might be a good bet.
As the blue line in Figure 3 shows, teams seeded No. 6 or No. 7 tend to get upset 40 to 45 percent of the time. These games are typically close to toss-ups. In 2022, five of those eight contests look riper than usual for an upset. Warning: the conclusions here are not great for Michigan State fans.
Based on this analysis, No. 7 Michigan State is officially on upset alert versus No. 10 Davidson. Furthermore, No. 11 Michigan also stands a very good shot to "upset" No. 6 Colorado State. As Figure 2 shows, the Wolverines are actually favored in this game. My metrics suggest an upset pick is in order in both cases. You, dear reader, will simply need to pick with your conscience.
As for the other potential upsets, Kenpom has No. 10 Loyola-Chicago favored over No. 7 Ohio State and No. 10 San Francisco favored over No. 7 Murray State. Those are both good bets. No. 6 Texas also looks vulnerable against No. 11 Virginia Tech.
The No. 8 versus No. 9 games are historically true toss-ups, as Figure 3 suggests. In this case, it is best to consult the Vegas line, which currently has No. 9 Memphis favored over No. 8 Boise State and No. 9 TCU as a pick'em versus No. 8 Seton Hall. Those are the two most likely "upsets" in that group of four games.
Vegas spreads are a useful tool for the first-round games. However, they are not available for any games in subsequent rounds. Fortunately, Kenpom data can be used to project these lines and win probabilities, which still allows for further analysis, as shown below in Figure 4.
Figure 4: Odds for the higher seeded teams to win for each seed pairing, relative to the historical odds (shown in blue) for rounds two to four, based on Kenpom efficiency margin data.
The left side of Figure 4 compares the odds for the higher seeds to win in the second round of the tournament.
No. 1 seeds get bounced in the second round roughly every other year, on average. Illinois experienced that in 2021, but all the No. 1 seeds advanced in 2019. Based on Figure 4, the most vulnerable No. 1 seed is Kansas, but there is still a 70 percent chance that the Jayhawks survive until the second weekend.
In general, about two-thirds of all No. 2 seeds advance to the Sweet 16 and it is quite rare that all four survive the first weekend in any given tournament. That said, there is no clear No. 2 seed that appears vulnerable in 2022. I can think of one No. 2 seed that I would like to see lose to a certain Green and White-clad No. 7 seed (if they make it that far) but I will save that analysis for later in the week.
As for the No. 3 and No. 4 seeds, history tells us that about half of them will likely advance to the Sweet 16. Figure 4 gives some strong hints as to which of these seeds are more likely to be upset, and the news is not great for some Big Ten fans.
Interestingly, there are four potential second-round games involving No. 3 or No. 4 seeds where the lower-seeded team is projected to be favored. No. 6 LSU would be favored over No. 3 Wisconsin, No. 5 Houston is projected to be favored over No. 4 Illinois, and No. 5 UCONN is a likely pick'em versus No. 4 Arkansas. Furthermore, No. 5 Iowa is projected to be a big favorite over No. 4 Providence.
Naturally, this assumes that all of the No. 4 seeds win their first round match-ups, which Figure 3 suggests is not likely. Also note that No. 3 seed Purdue may be slightly vulnerable to No. 6 Texas, if the Longhorns can beat Virginia Tech in the first round.
Finally, the right side of Figure 4 presents the same data for the potential match-ups in the Sweet 16 and Elite Eight. This analysis does assume that the top seeds all advance, which is unlikely, but it does provide some hints as to which teams are more or less likely to advance to the Final Four.
For example, No. 1 Baylor looks potentially vulnerable to No. 4 UCLA in the Sweet 16, and even more vulnerable in a potential Regional Final showdown with No. 2 Kentucky. Figure 4 also suggests that No. 3 Tennessee and No. 3 Texas Tech both might be favored over No. 2 Villanova and No. 2 Duke in their respective regions. No. 1 Kansas would also be only a slight favorite over No. 2 Auburn in a potential Midwest Regional Final.
The analysis above will hopefully provide a good start in filling out a bracket. But, exactly how many upsets should we expect? How is each individual Region likely to shake out? Stay tuned for Part Two, coming soon.
The time has almost come for the Madness to begin once again. On Thursday at 12:15 p.m., the first round of the NCAA Tournament will tip off. But, before you fill out your bracket, I have some math-based tips to help you along.
On Tuesday, I broke down my method for spotting the most likely upsets on the bracket. That analysis will go a long way toward making some informed office pool decisions. Today, in part two of this series, I want to approach the analysis from the point of view of each region and the tournament as a whole. In other words, yesterday I identified the pieces, and today I will tell you how to assemble them.
Before we dive into each region, there are two other figures that will add context to the analysis. The first one shows the average number of upsets to expect in each round and overall.
Figure 1 shows three sets of data (two simulations and one actual measurement). The blue bars show the number of upsets projected in my simulation of the 2022 Tournament. The red bars show the average number of upsets per round in the set of simulations that I have performed on the last 19 tournaments. The green bars show the actual average number of upsets going back to 1985, when the tournament expanded to 64 teams.
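To give a feel for how a simulated upset count per round can be produced, here is a minimal Monte Carlo sketch. It is not the simulation behind the figure. It assumes each team has a single power rating in points, that the chance of winning follows a normal curve on the rating gap with an assumed 11-point standard deviation, and that the bracket list has a power-of-two length in bracket order. All names and ratings are hypothetical.

```python
import random
from statistics import NormalDist


def win_prob(rating_a, rating_b, sigma=11.0):
    """Chance that team A beats team B (sigma is an assumed value)."""
    return NormalDist().cdf((rating_a - rating_b) / sigma)


def simulate_upsets(bracket, rng):
    """Play one bracket once; return the upset count for each round.

    bracket: list of (seed, rating) tuples in bracket order.
    An upset means the team with the larger seed number won.
    """
    upsets = []
    teams = list(bracket)
    while len(teams) > 1:
        winners, count = [], 0
        for a, b in zip(teams[0::2], teams[1::2]):
            a_wins = rng.random() < win_prob(a[1], b[1])
            winner, loser = (a, b) if a_wins else (b, a)
            if winner[0] > loser[0]:
                count += 1
            winners.append(winner)
        upsets.append(count)
        teams = winners
    return upsets


def average_upsets(bracket, n_sims=10000, seed=0):
    """Average upsets per round over n_sims simulated brackets."""
    rng = random.Random(seed)
    n_rounds = len(bracket).bit_length() - 1
    totals = [0] * n_rounds
    for _ in range(n_sims):
        for i, count in enumerate(simulate_upsets(bracket, rng)):
            totals[i] += count
    return [total / n_sims for total in totals]


# Hypothetical four-team pod: (seed, rating in points).
toy_bracket = [(1, 25.0), (4, 15.0), (2, 20.0), (3, 17.0)]
toy_averages = average_upsets(toy_bracket, n_sims=2000)
```

The same loop scales to a full region or the whole field by passing a longer bracket list.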
Figure 2: Historical make-up of the Final Four, based on seeds.
Region by Region Analysis
Let's now move on to look at each Region in more detail, starting in the West. In each case, I will present a table of data that summarizes each team's odds to advance through each round of the Tournament, based on the projected point spreads for any possible tournament match-up.
In addition, each table contains a block of data that compares each team's Kenpom efficiency and round-by-round odds to those of a historically average team with the same seed. This gives a clear indication of the relative strength or weakness of each team in each region.
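Round-by-round odds of this kind can also be computed exactly, without simulation, once a win probability is available for every possible match-up. The sketch below is one way to do that, not necessarily how the tables here were built. It assumes a list of power ratings in bracket order (power-of-two length) and a normal-curve win probability with an assumed 11-point standard deviation. The function names and ratings are hypothetical.

```python
from statistics import NormalDist


def win_prob(rating_a, rating_b, sigma=11.0):
    """Chance that team A beats team B (sigma is an assumed value)."""
    return NormalDist().cdf((rating_a - rating_b) / sigma)


def advancement_odds(ratings):
    """Return one list per round: the chance each team wins that round.

    ratings: power ratings in bracket order, so teams 0 and 1 meet
    first, then the winner meets the winner of teams 2 and 3, etc.
    """
    n = len(ratings)
    reach = [1.0] * n          # chance each team is alive this round
    rounds = []
    size = 1                   # teams per sub-bracket entering the round
    while size < n:
        nxt = [0.0] * n
        for i in range(n):
            # Possible opponents sit in the neighboring sub-bracket.
            start = ((i // size) ^ 1) * size
            for j in range(start, start + size):
                nxt[i] += reach[i] * reach[j] * win_prob(ratings[i], ratings[j])
        rounds.append(nxt)
        reach = nxt
        size *= 2
    return rounds


# Hypothetical four-team pod, ratings in points.
toy_odds = advancement_odds([25.0, 15.0, 20.0, 17.0])
```

Each team's chance of winning a round is its chance of getting there, multiplied by the sum over possible opponents of the chance that opponent gets there and loses to it. The odds in the last round sum to one across the whole bracket.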
West Region
- No. 10 Davidson over No. 7 Michigan State (don't @ me. It's just #math)
- No. 9 Memphis over No. 8 Boise State
- No. 5 UCONN over No. 4 Arkansas
- No. 1 Gonzaga versus No. 5 UCONN
- No. 2 Duke versus No. 3 Texas Tech
South Region
Table 2: 2022 NCAA Tournament South Region Odds.
- No. 10 Loyola over No. 7 Ohio State
- No. 9 TCU over No. 8 Seton Hall
- No. 11 Michigan over No. 6 Colorado State
- No. 5 Houston over No. 4 Illinois
- No. 10 Loyola over No. 2 Villanova
- No. 1 Arizona versus No. 5 Houston
- No. 3 Tennessee versus No. 10 Loyola
Midwest Region
Table 3: 2022 NCAA Tournament Midwest Region Odds.
- No. 13 South Dakota State over No. 4 Providence
- No. 6 LSU over No. 3 Wisconsin
- No. 1 Kansas versus No. 5 Iowa
- No. 2 Auburn versus No. 6 LSU
East Region
Table 4: 2022 NCAA Tournament East Region Odds.
- No. 10 San Francisco over No. 7 Murray State
- No. 11 Virginia Tech over No. 6 Texas
- No. 1 Baylor versus No. 4 UCLA
- No. 2 Kentucky versus No. 3 Purdue
Final Analysis
Based on this analysis, I am projecting a Final Four consisting of the following teams:
- No. 1 Gonzaga versus No. 2 Kentucky
- No. 1 Kansas versus No. 3 Tennessee
- Gonzaga (West No. 1 seed)
- Arizona (South No. 1 seed)
- Kentucky (East No. 2 seed)
- Houston (South No. 5 seed)
- Baylor (East No. 1 seed)
- Kansas (Midwest No. 1 seed)