The 2018 NCAA Men's basketball tournament had some pretty special moments. From the controversial quad system to a 16-seed beating a 1-seed to an 11-seed making the Final Four to crazy finishes, it was a pretty fun 3 weeks (except for MSU fans; it honestly kind of sucked for us). But was it really such a strange tournament? I mean, they call it "March Madness" for a reason, right? Well, as we will see, at the end of the day it was really not that strange at all.
As for the total number of upsets (by seed only), the 2018 Tournament produced a total of 20 upsets in 67 games. The average number of upsets per year back to the 64-team expansion in 1985 is 17.6 ± 3. But, as you can see from the histogram below, there is a fair amount of scatter in the data, with a minimum value of 12 and a maximum of 23. While 20 is a bit high relative to the average, it is actually the mode (most frequently appearing value) of the distribution.
If we instead look at the upset distribution on a per-round basis, we again see that the number of upsets in a given round in 2018 was pretty normal. Nine first round upsets is again the mode and very close to the average (8.1). Similarly, six 2nd round upsets is pretty normal. It is really only the four upsets in the Sweet 16 round that are a bit high.
As for the late rounds, there was only one true seed upset observed from the regional final round forward (11-Loyola over 9-Kansas State), which is below the historical average of 2.5 for the last 3 rounds.
But all this analysis got me thinking: not all upsets are created equal. Some upsets (like 16-UMBC's upset of 1-Virginia) are historic, while others (such as 9-Alabama over 8-VA Tech) are very pedestrian. So, I considered the question of how to define a "big upset." One obvious metric to use is the difference between the seeds of the two teams involved in the upset. The bigger this differential, the bigger the upset. But this would also seem to be a bit round-dependent.
In the first round, for example, 8-seeds, 7-seeds, and 6-seeds all get upset fairly frequently (over 35% of the time). The 5-12 upset is also fairly common (32% of the time). But the 4-13 upset is noticeably less common (only 20% of the time), as are upsets of 3-, 2-, or 1-seeds. So, for 1st round games, I set the seed differential cut-off for a "big upset" at 8.
In the second round, in theory, there should not be too many seeds around that are large enough to cause a "big upset" if the seed differential cut-off remains at 8. But if we think about the typical seed match-ups, a fairly clear rule of thumb appears: reduce the cut-off to 4. In my view, an 8/9 seed beating a 1-seed in the second round is a big upset, as is a 7/10 seed beating a 2-seed. A 6-seed over a 3-seed is not a huge upset, but an 11-seed over a 3-seed is (not that we have any experience with that...).
Once you get to the Sweet 16 and beyond, I propose tightening the criterion to a seed differential of 3. My feeling is that a 4/5 seed beating a 1-seed is still a big deal. A 3-seed beating a 2-seed is fairly common, but a 6-seed over a 2-seed is notable. Interestingly, as I consult my table / graph of upset probabilities by round (see here), all of the upsets listed above that I consider to be "big" happen only 30% of the time or less. In effect, this is the cut-off that I have selected.
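The round-dependent rule described above can be sketched in a few lines of Python. This is only an illustration of the criteria as I read them: the function names, the round numbering (1 = first round, 2 = second round, 3 and up = Sweet 16 and beyond), and the choice to treat each cut-off as inclusive ("at least") are my assumptions, not something the original analysis spells out.

```python
# Minimum seed differential for a "big upset" in the first two rounds.
# Assumed round numbering: 1 = first round, 2 = second round,
# 3 and above = Sweet 16 and beyond.
EARLY_ROUND_CUTOFF = {1: 8, 2: 4}
LATE_ROUND_CUTOFF = 3


def is_upset(winner_seed: int, loser_seed: int) -> bool:
    """An upset (by seed only) means the numerically larger seed won."""
    return winner_seed > loser_seed


def is_big_upset(round_number: int, winner_seed: int, loser_seed: int) -> bool:
    """Apply the round-dependent seed-differential cut-off.

    Assumption: the cut-off is inclusive, since a 4-seed beating a
    1-seed (differential of 3) is described as a big deal in late rounds.
    """
    if round_number < 1:
        raise ValueError("round_number must be 1 or greater")
    if not is_upset(winner_seed, loser_seed):
        return False
    cutoff = EARLY_ROUND_CUTOFF.get(round_number, LATE_ROUND_CUTOFF)
    return winner_seed - loser_seed >= cutoff


# The three 2018 games named in the text: (round, winner seed, loser seed).
games = [
    (1, 16, 1),  # 16-UMBC over 1-Virginia
    (1, 9, 8),   # 9-Alabama over 8-VA Tech
    (4, 11, 9),  # 11-Loyola over 9-Kansas State (regional final)
]
for round_number, winner, loser in games:
    print(round_number, winner, loser, is_big_upset(round_number, winner, loser))
```

Under these assumptions only the UMBC game counts as "big"; Loyola's regional final win is a true seed upset, but its differential of 2 falls short of the late-round cut-off.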
With this all in mind, and using these criteria, I calculated the number of big upsets per year and found that 2018 was on the high side with 11 total, tied for 2nd all time. The histogram is shown below, along with the histogram for the final 3 rounds of the tournament.
Upsets aside, the fact that a 3-seed and an 11-seed both made the Final Four is pretty rare, right? Well, not exactly. While an 11-seed is the lowest seed ever to make the Final Four, it has happened three times before (LSU in 1986, George Mason in 2006, and VCU in 2011). Furthermore, if you average the seed value of the highest, 2nd highest, 3rd highest, and lowest seed in each Final Four, you get 1.1, 1.7, 3.0, and 5.7. The actual distributions are shown below:
So, having at least two 1-seeds happens 54% of the time. Having a 3-seed or lower as the 3rd highest seed happens 44% of the time, so that is also not that strange. As for the lowest seed, it is a 5-seed or lower 56% of the time, and an 8-seed or lower has appeared in the Final Four 6 times in the past 8 years. Loyola's run to the Final Four was amazing, but it was not that far from the norm.
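To make the comparison concrete, here is a small sketch that ranks the 2018 Final Four seeds (two 1-seeds, a 3-seed, and an 11-seed) from highest to lowest and compares each slot with the historical averages quoted above. The function names are hypothetical, and the averages are simply copied from the text rather than recomputed from year-by-year data.

```python
# Historical average seed in each Final Four slot, as quoted in the text:
# highest, 2nd highest, 3rd highest, lowest.
HISTORICAL_AVG = (1.1, 1.7, 3.0, 5.7)


def rank_final_four(seeds):
    """Order four seeds from highest (numerically smallest) to lowest."""
    if len(seeds) != 4:
        raise ValueError("a Final Four has exactly four teams")
    return tuple(sorted(seeds))


def deviation_from_average(seeds):
    """Seed minus historical average for each slot, rounded to one decimal."""
    ranked = rank_final_four(seeds)
    return tuple(round(seed - avg, 1) for seed, avg in zip(ranked, HISTORICAL_AVG))


final_four_2018 = [1, 3, 11, 1]
print(rank_final_four(final_four_2018))
print(deviation_from_average(final_four_2018))
```

Only the last slot (the 11-seed) sits well away from its average; the other three slots are at or slightly better than the historical norm.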
If you are looking for a final bit of trivia about the strangeness of the 2018 Tournament, I will leave you with this: it was a great year to be a 9-seed. 9-seeds are typically terrible once (or if) they make it out of the first round. 9-seeds are 9-67 (11.8%) in 2nd round games, which is abysmal. 8-seeds, in contrast, are 16-68 (19%), which is not awesome, but is about 1.6 times as good. That, in itself, is weird, considering 8/9 games are toss-ups. Until this year, only four other 9-seeds had ever made it to the Regional Final round. Kansas State and Florida State pushed that number to 6 this year.
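The percentages above follow directly from the records, assuming "9-67" and "16-68" are read as wins-losses. A minimal check (the helper name `win_pct` is just for illustration):

```python
def win_pct(wins: int, losses: int) -> float:
    """Winning fraction for a wins-losses record."""
    games = wins + losses
    if games == 0:
        raise ValueError("record must include at least one game")
    return wins / games


nine_seed = win_pct(9, 67)    # 9-seeds in 2nd round games
eight_seed = win_pct(16, 68)  # 8-seeds in 2nd round games

print(f"9-seeds: {nine_seed:.1%}")
print(f"8-seeds: {eight_seed:.1%}")
print(f"ratio:   {eight_seed / nine_seed:.2f}")
```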
Also of note, it was a good year to be a 5-seed. For only the 4th time in history, three 5-seeds advanced to the Sweet 16. Come to think of it, the Final Four was two 1-seeds, a 3-seed, and an 11-seed, and two 7-seeds made the Sweet 16. So, despite the fact that the majority of the data suggest the 2018 tournament was fairly normal, I suppose it was a bit "odd."