NCAA Tournament Analysis: The Sweet Sixteen

Basketball season may be over for the Michigan State Spartans, but the NCAA Tournament will be continuing with the Sweet 16 this coming Saturday. After the bracket was released, I presented my detailed analysis of the bracket and made some math-based predictions about how the first weekend and entire tournament might play out.

While it would be more fun to write about a potential MSU-Alabama match-up in the Sweet 16 (which might have actually come to fruition had the Spartans simply boxed out properly on a rebound in the final seconds of the First Four contest against UCLA), it is still fun to reflect on the results of the first weekend and to take another math-based look at the remaining tournament field. If nothing else, in the great words of Coach Mark Dantonio, it is time to complete this circle.

Let's start with a review of the wild action of the first two rounds.

Results of Rounds One and Two

In my analysis of the bracket, I presented data showing that the average number of upsets to expect in the first round of the NCAA Tournament is eight, and in the second round, that number is five. When I looked at the projected odds for each of the first round and projected second round games, I identified a few match-ups with better than average odds for an upset.

Based on this analysis, I made a historically average number of upset picks and then carried this analysis through to the Final Four and eventual champion. In the real tournament, there were an above average number of seed upsets in both rounds (ten and six to be exact). Table 1 below summarizes my upset picks and the actual upsets through two rounds.

Table 1: Summary of NCAA Tournament upset picks and upset results through two rounds

Of the 13 total upset predictions that I made this year, six were correct, two earn partial credit (the yellow 'O's), and five were wrong. There were eight additional upsets that I did not pick. On balance, I think that my method did OK.

The biggest success was that I correctly predicted one of the biggest upsets of the weekend: No. 14 Abilene Christian's upset win over No. 3 Texas. I also picked No. 13 Ohio to upset No. 4 Virginia. I am also giving myself credit for taking UCLA/MSU to beat BYU, even though I was clearly thinking that it would be the Spartans and not the Bruins to win both of those games. Most office pools take the First Four winner as either/or, so it still counts in my book. Hey, there has to be some benefit of the First Four, right?

The other partial credit comes from the fact that I correctly bounced No. 3 West Virginia and No. 4 Oklahoma State in the second round; I just had the triumphant opponent wrong. From an office pool point of view, this also has some value.

As for a more visual view of the upsets, I repeat below one of the main figures that I used to make picks last week, with the actual upsets highlighted in bold, red text.

Figure 1: 2021 odds for the first round games compared to the average historical odds for each seed pair

From a certain point of view, in retrospect, my analysis perhaps did a better job than I originally thought. Of the 32 total first round games, only eight contests clearly fell below the average line, which denotes a more likely upset. Five of those games ended in an upset, and Liberty and Colgate were both very competitive in their games. It was only the LSU - Saint Bonaventure game that bucked this trend, and I didn't even make that pick.

I probably should have more seriously considered the possible Purdue upset by North Texas, but I decided to ignore the warnings of my own analysis. As for the other five upsets, three of them (Maryland, Syracuse, and UCLA) all lie close to the average line.

Only two of the 10 first round upsets were truly surprising: Oregon State and Oral Roberts. As for Oregon State, their upset of Tennessee perhaps could have been predicted had I simply remembered that Coach Rick Barnes was on the bench, and he is absolutely notorious for losing to lower seeds. As for Ohio State, my math suggests that upsets on the No. 1 and No. 2 lines are simply random bad luck. It's happened to the best of us...

Figure 2 gives a similar retrospective analysis of the second round games.

Figure 2: 2021 odds for second round games compared to the average historical odds for each seed pair

As for the predictability of the six second round upsets, the results are less clear. In total, seven of the 16 games had above average upset odds and only three of those games ended in upsets. In this case, I did correctly pick USC's upset of Kansas and West Virginia's loss, but Texas Tech and Maryland (actually UCONN) let me down on the upset front.

It is also clear that I let my belief in the strength of the Big Ten cloud my analysis a bit. The data did suggest that Wisconsin had a shot to beat Baylor, and that was the pick that I made. BUT, the data suggested that Loyola beating Illinois was actually more likely. If I couple that with the in-state rivalry aspect (similar to my analysis of Texas and Abilene Christian), then perhaps I should have seen that one coming. My faith in the Big Ten also caused two of my Final Four picks, Illinois and Ohio State, to be knocked out very early.

As for the other three upsets, the Oregon State - Oklahoma State game was right on the average line, but Iowa and Florida both had better than expected odds to avoid an upset. Once again, you win some and you lose some.

How Mad Was It?

Based on a few different measures, such as the number of double digit upsets, the 2021 Tournament looks to be one of the most chaotic Tournaments on record. That said, measures like just counting double digit seed underdogs are not very mathematically precise. Fortunately, there is a better way to compare the relative madness of different months of March.

In order to quantify the relative likelihood of a specific set of first and second round outcomes in any given year, one just needs to know the odds of each individual game outcome. You can then multiply those probabilities together to get the overall odds.
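As a rough illustration of that multiplication, here is a minimal Python sketch. The function name `outcome_odds` and the four example probabilities are made up for illustration (they are not the real 2021 values), and the calculation assumes the games are independent of each other. It sums logarithms instead of multiplying directly, which is just a numerical convenience when dozens of small numbers are involved.

```python
import math

def outcome_odds(game_probs):
    """Return N such that the joint outcome has odds of "1 in N".

    game_probs holds, for each game, the pre-game probability of the
    result that actually happened. Games are assumed to be independent.
    """
    log_p = sum(math.log(p) for p in game_probs)
    return math.exp(-log_p)

# Hypothetical probabilities for four games (illustration only).
example = [0.9, 0.75, 0.4, 0.6]
print(f"1 in {outcome_odds(example):.1f}")
```

Feeding in the 32 first round probabilities (or all 48 through two rounds) would give the kind of "1 in N" figure quoted in this section.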

Fortunately, I happen to have just these odds, as derived from Kenpom efficiency margin data. In fact, these are exactly the numbers that I use to run my Monte Carlo simulations. I also happen to have performed the same calculation on each Tournament back to the beginning of the Kenpom era (2002).

The results tell me that the odds for the specific first round outcome in 2021 were:

1 in 81.5 million.

That is on the high side. The first round odds in 2013, 2016, and 2018 were similar in magnitude, but a little lower. However, one other year, 2012, still holds the record for the least likely first round outcome at:

1 in 800 million.

This was the year when both Duke and Missouri were upset as No. 2 seeds by No. 15 seeds Lehigh and Norfolk State, respectively. The year 2012 also had 10 total first round upsets, including a No. 4 seed and two No. 5 seeds. However, the second round in 2012 recorded only two additional upsets, and the odds of the specific outcome after two rounds were "only"

1 in 1.3 trillion.

These are actually slightly shorter odds than those after two rounds in 2018, when No. 1 Virginia was upset in the first round by No. 16 UMBC, and then the second round saw the upset of a second No. 1 seed (Xavier) and half of the No. 2 seeds (Cincinnati and North Carolina). The odds of seeing the exact scenario in 2018 were:

1 in 1.8 trillion.

But, that pales in comparison to the tally from 2021. The qualitative estimates are, in fact, correct. The odds that I calculate for the current tournament results after two rounds are:

1 in 6.5 trillion, 

which are the longest odds of the Kenpom Era by a factor of three.

Analyzing the Sweet 16

So, what's next? With 16 teams remaining it is time to wipe the slate clean and try to make some new predictions about how the rest of the tournament will play out. I will start with the results of a new Monte Carlo simulation of the remainder of the tournament.

Table 2: Monte Carlo Simulation results starting from the Sweet 16

I decided to keep the pre-tournament Kenpom efficiency values in this case, so I don't want to get too hung up on the details. What this tells me is that Gonzaga is still a heavy favorite to win it all (43 percent) and that the Zags have about a 75 percent chance to reach the Final Four.
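For anyone curious what a simulation like this looks like in practice, here is a minimal sketch. It is not the actual code behind Table 2. The `win_prob` model is an assumption: it scales the efficiency margin difference to a 68-possession game and treats the final margin as normally distributed with an 11-point standard deviation. The team names and efficiency margins in the example are hypothetical.

```python
import math
import random
from collections import Counter

def win_prob(em_a, em_b, possessions=68.0, sigma=11.0):
    """Assumed model: chance that team A beats team B.

    The efficiency margin difference (points per 100 possessions) is
    scaled to one game, and the result is modeled as normal with
    standard deviation sigma. Both parameters are illustrative guesses.
    """
    margin = (em_a - em_b) * possessions / 100.0
    return 0.5 * (1.0 + math.erf(margin / (sigma * math.sqrt(2.0))))

def simulate(bracket, n_sims=10000, seed=0):
    """Play out a single-elimination bracket many times.

    bracket is a list of (name, efficiency_margin) tuples in bracket
    order, with a power-of-two length. Returns the fraction of
    simulations won by each team.
    """
    rng = random.Random(seed)
    titles = Counter()
    for _ in range(n_sims):
        alive = list(bracket)
        while len(alive) > 1:
            next_round = []
            for i in range(0, len(alive), 2):
                a, b = alive[i], alive[i + 1]
                a_wins = rng.random() < win_prob(a[1], b[1])
                next_round.append(a if a_wins else b)
            alive = next_round
        titles[alive[0][0]] += 1
    return {name: titles[name] / n_sims for name, _ in bracket}

# Hypothetical four-team example (names and margins are made up).
teams = [("Team A", 30.0), ("Team B", 18.0), ("Team C", 24.0), ("Team D", 22.0)]
for name, share in simulate(teams).items():
    print(f"{name}: {share:.1%}")
```

Counting how often each team survives a given round, instead of only the final, would give Final Four odds in the same way.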

Then, there are three teams next in line with similar odds to cut down the nets: Michigan, Houston, and Baylor (around 13 percent odds each). Each of those teams is 50-50 to advance to the Final Four. Then, there is a group of dark horse teams (Loyola, Alabama, Arkansas, USC, and Villanova) with between two and five percent odds to win the Title.

I also included a column in this table labeled "normalized final four odds." This is my attempt to estimate the relative ease or difficulty of each team's path to the Final Four. The calculation involves estimating the odds of each team to advance to the Final Four if they were only as good as a benchmark team with an efficiency margin of +19.00 (an average high-major team).

Higher percentages mean an easier path, which is the case for Arkansas and Houston, as they will face double-digit seeds Oral Roberts and Syracuse, respectively, in round three. On the opposite end of the spectrum is Creighton (who will face Gonzaga). The Bluejays grade out to have the most difficult remaining path.
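The normalized path idea can be sketched for a single region of four remaining teams as follows. This is an illustration under stated assumptions, not the exact calculation behind the table: the `win_prob` model (68 possessions, 11-point standard deviation) is assumed, and the opponent efficiency margins in the example are hypothetical. Only the +19.00 benchmark comes from the text above.

```python
import math

def win_prob(em_a, em_b, possessions=68.0, sigma=11.0):
    """Assumed model: normal margin centered on the scaled EM difference."""
    margin = (em_a - em_b) * possessions / 100.0
    return 0.5 * (1.0 + math.erf(margin / (sigma * math.sqrt(2.0))))

def normalized_final_four_odds(opponent_em, other_pair_em, benchmark_em=19.0):
    """Odds that a benchmark-strength team wins two games to reach the Final Four.

    opponent_em is the Sweet 16 opponent's efficiency margin, and
    other_pair_em holds the margins of the two teams in the other
    Sweet 16 game in the same region.
    """
    c, d = other_pair_em
    p_first = win_prob(benchmark_em, opponent_em)
    p_c_advances = win_prob(c, d)
    # Weight the regional final by who is likely to show up in it.
    p_second = (p_c_advances * win_prob(benchmark_em, c)
                + (1.0 - p_c_advances) * win_prob(benchmark_em, d))
    return p_first * p_second

# Hypothetical paths: a soft one and a brutal one.
print(f"easy path: {normalized_final_four_odds(8.0, (15.0, 20.0)):.1%}")
print(f"hard path: {normalized_final_four_odds(35.0, (22.0, 24.0)):.1%}")
```

Because every team is swapped out for the same benchmark, any difference between two results reflects the opponents alone, which is the point of the normalization.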

As for potential upsets to look out for in the next few rounds, Figure 3 below compares the odds in each contest relative to the historical average for each given seed combination. This is essentially the same analysis shown above in Figures 1 and 2.

Figure 3: 2021 odds for the Sweet 16 games (left) and potential regional final games (right) compared to the average historical odds for each seed pair

In this case, for the Sweet 16 games, I am using the odds from the actual opening Vegas lines, as opposed to the Kenpom projected odds. For the regional final round (Figure 3, right), I revert to the odds from Kenpom.

Based on both the original simulation results and the expected value calculations, two upsets are expected in the Sweet 16 round. Based on the left panel of Figure 3, the most likely upsets are for No. 1 Michigan to lose to Florida State and No. 6 USC to lose to No. 7 Oregon.

That said, USC actually has better odds than an average No. 6 versus No. 7 seed match-up, which makes me balk at that pick a little. The next most likely upset would be for No. 11 Syracuse to beat No. 2 Houston, which just feels annoyingly correct. If this were to come to pass, Jim Boeheim would surpass Tom Izzo for the most upset wins in Tournament history at 16. Dislike.
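The expected value calculation mentioned above boils down to adding up each underdog's chance to win. Here is a minimal sketch of that arithmetic. The function names and the game labels and probabilities in the example are hypothetical and are not the actual Vegas-derived odds.

```python
def expected_upsets(favorite_win_probs):
    """Expected number of upsets: the sum of each underdog's win probability."""
    return sum(1.0 - p for p in favorite_win_probs.values())

def rank_upsets(favorite_win_probs):
    """Game labels sorted from most likely upset to least likely."""
    return sorted(favorite_win_probs, key=favorite_win_probs.get)

# Hypothetical favorite win probabilities for an eight-game round.
games = {
    "Game 1": 0.88, "Game 2": 0.55, "Game 3": 0.70, "Game 4": 0.80,
    "Game 5": 0.58, "Game 6": 0.75, "Game 7": 0.85, "Game 8": 0.72,
}
print(f"expected upsets: {expected_upsets(games):.2f}")
print("most likely upsets:", rank_upsets(games)[:2])
```

Rounding the expected count gives the number of upsets to pick, and the ranked list suggests where to spend them.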

As for the regional final round, the odds suggest one out of the four games will end in an upset. On the right panel of Figure 3, I compare the teams under the assumption that the higher seeds all advance. In this scenario, the most likely upset is for No. 8 Loyola to beat No. 2 Houston (if the Cougars can solve the Syracuse zone). After the beat-down that the Ramblers gave to the Illini last weekend, I would totally buy that.

If I were to start again from the Sweet 16 round, I believe that I would take Florida State and Syracuse to win, and then just the top seeds in the next round, which would give me a Final Four of
  • No. 1 Gonzaga
  • No. 1 Baylor
  • No. 8 Loyola-Chicago
  • No. 2 Alabama
Alternatively, I could see Oregon beating USC, but Houston beating Syracuse. I would take the same Final Four in both scenarios.

This Final Four is a reasonable distribution of seeds, and I think that it is totally reasonable based on the eyeball test from last weekend. I would take Gonzaga over Alabama, and then I will take a flyer on Loyola to upset Baylor before succumbing to the machine that is Gonzaga.

That is all for today. Enjoy what is left of March Madness and as always, Go Green.
