Fun Facts About the Vegas Spread

I have presented most or all of this data before, but I thought it would be a good time for a quick refresher.  A lot of people like to talk about the Vegas spread and how its purpose is just to get equal amounts of money on both sides.  This is certainly true.  However, there are also a lot of very interesting mathematical / statistical facts about the spread that can provide insight.  I found a cache of data on the internet which contains historical spread and game results data back to 2001.  This data set includes close to 12,000 games.  Here are a few interesting facts:

1) "Vegas always knows" on average

Based on my analysis, Vegas is the best predictor of the outcome of a given game.  If you plot the final spread vs. the average margin of victory, you get a very high correlation and a slope of 1.00:


2) However, the actual margin of victory vs. the spread has a ton of variance

If I just plot here all the results from 2017, the correlation is very weak and the scatter is tremendous:


Also notice all those data points below the x-axis.  Those are upsets, which account for roughly 25% of all games in a given season, every season.  So, if one plots the standard deviation instead of just the average margin of victory, you get this:


The standard deviation is 14-15 points, which if you think about it, is huge.  That is like saying, "Team X is favored to beat Team Y by 5 points, ± 2 touchdowns."  Also, this deviation from the spread is essentially normally distributed, as shown here:


So, another way to think about this is that roughly 2/3 of all games will have a margin of victory that is within ±2 TDs of the spread.  The crazy thing is, that implies that a full 1/3 of games will have a margin of victory more than 2 TDs from the spread.  AND, for about 5% of games, you can expect the spread to be off by more than 4 TDs, in either direction!  Considering there are 50-60 games a week, this result is likely to be observed 2-3 times a week.
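The arithmetic behind those fractions can be checked with a few lines of Python. This is only a sketch: it assumes the deviation from the spread is exactly normal with a standard deviation of 14 points (the low end of the 14-15 range above), and the names `SIGMA`, `normal_cdf`, and `fraction_within` are mine, not from the original analysis.

```python
import math

# Assumption: deviation from the spread ~ Normal(0, SIGMA), SIGMA in points.
SIGMA = 14.0


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def fraction_within(points, sigma=SIGMA):
    """Fraction of games whose margin lands within +/- `points` of the spread."""
    z = points / sigma
    return normal_cdf(z) - normal_cdf(-z)


within_2td = fraction_within(14)        # within 2 TDs: roughly 2/3
beyond_4td = 1.0 - fraction_within(28)  # off by more than 4 TDs: roughly 5%

print(f"within 2 TDs: {within_2td:.3f}")
print(f"beyond 4 TDs: {beyond_4td:.3f}")
for games_per_week in (50, 60):
    print(f"{games_per_week} games/week -> about "
          f"{games_per_week * beyond_4td:.1f} such games per week")
```

With 50-60 games a week, the expected count lands between 2 and 3, which matches the "2-3 times a week" figure.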

3) The probability of victory is well correlated to the spread

If we take all this data together, we can also plot the odds that the favored team will win any given contest, based on the spread:


The trend line is derived from the data shown above, where for each spread I assume a normally distributed actual result with a standard deviation of about 14 (14.112, to be exact, gave the minimum average deviation).  So, I only have one fitting parameter, which is nice.  Despite having roughly 12,000 data points, there is still some scatter, but if I use a 7-point boxcar smoothing function, it looks like this:


You can see how well the trend line fits the data. 
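For anyone who wants to reproduce this kind of trend line, here is a minimal sketch of the three pieces involved: the one-parameter normal model, a 7-point boxcar (moving average) smoother, and a simple fit for the standard deviation. The function names are hypothetical, and two details are my assumptions rather than anything stated above: that "average deviation" means the unweighted mean absolute difference between model and observed win rates, and that the smoothing window simply shrinks at the ends of the data.

```python
import math


def win_probability(spread, sigma=14.112):
    """P(favorite wins) if the margin is Normal(mean=spread, sd=sigma)."""
    return 0.5 * (1.0 + math.erf(spread / (sigma * math.sqrt(2.0))))


def boxcar_smooth(values, width=7):
    """Centered moving average; the window shrinks near the ends (assumption)."""
    half = width // 2
    smoothed = []
    for i in range(len(values)):
        window = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


def fit_sigma(spreads, win_rates, lo=8.0, hi=20.0, step=0.001):
    """Grid search for the sigma that minimizes the mean absolute deviation
    between modeled and observed win rates (assumed fitting criterion)."""
    best_sigma, best_error = lo, float("inf")
    n_steps = int(round((hi - lo) / step))
    for k in range(n_steps + 1):
        sigma = lo + k * step
        error = sum(abs(win_probability(s, sigma) - w)
                    for s, w in zip(spreads, win_rates)) / len(spreads)
        if error < best_error:
            best_sigma, best_error = sigma, error
    return best_sigma


# Self-check on model-generated (not real) win rates: the fit should
# recover the sigma that was used to generate them.
check_spreads = [0.5 * k for k in range(1, 81)]
check_rates = [win_probability(s) for s in check_spreads]
print(round(fit_sigma(check_spreads, check_rates), 3))
print(round(win_probability(10), 2))
```

Under this model a 10-point favorite comes out at about a 76% chance of winning. To use it on real data, pass in the observed fraction of favorite wins at each spread in place of `check_rates`.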

Now, there are still several very interesting questions.  Does Vegas adjust the lines based on the known betting habits of certain fan bases? They almost certainly do, but I have never been able to detect any clear bias in the data. Also, it would certainly be very easy to do that for just a handful of games, and that data would just get swamped by all the other data.  So, I just ignore this possibility.  If I can't measure it systematically, I don't care about it.

So, if a team is favored by 10 points, this translates to a ~75% chance of victory.  Does this mean: If those teams were to play 100 times, one team would (roughly) win 25 times and the other would win 75 times?  OR, does it simply mean that in any given game with a 10-point spread, the favorite will win 75% of the time by an average margin of 10 points?  I think that the second statement is clearly the correct one.

The first statement is a fascinating concept in itself, as I think it is easy to fall into the trap of thinking that (for example) since MSU beat UofM last year, the 2017 MSU team would beat the 2017 UofM team 100% of the time if they played again.  That is certainly not true.  But, what I think the Vegas line does (in effect if not in intent) is to estimate this likelihood, based on all the information available at the time.  By the end of the season, I think it is pretty likely that they get close to this.  

Finally, for reference, here is the probability of victory data in tabular form:


I also have one new piece of data to share, which is the likelihood of a given big upset, per year.  The table above shows that once the spread gets over ~28 points, the odds of the underdog winning drop to under 2%.  But, sometimes it is hard to understand how likely or unlikely this event actually is, especially since most of the contests in a given year have a spread that is within 1-2 TDs.  But, if you factor in the number of games typical in a given year with a given spread, you can create a kind of cumulative distribution function of the number of expected upsets observed in a given year, as a function of the spread.  The chart is shown here:


An upset when the spread gets above 14 is a once-a-week type of occurrence (not shown).  Once the spread gets over 25, we enter the realm of a "once in a season" event.  A spread of 30 or higher is a once-every-3-years event, and it goes up quickly.  The "10-year storm" is a spread of 33.5, and the 50-year storm is a spread of 37.5, which just so happens to have been the opening spread for the biggest upset on record, Stanford's 2007 upset of USC.
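The calculation behind that chart can be sketched as follows: multiply the number of games per year at each spread by the model's upset probability, then accumulate from the largest spread downward. The game counts below are made-up placeholders, NOT the counts from my data set, so the printed numbers will not match the chart. The function names are hypothetical, and the standard deviation of 14.112 is carried over from the fit described earlier.

```python
import math


def upset_probability(spread, sigma=14.112):
    """P(underdog wins) if the margin is Normal(mean=spread, sd=sigma)."""
    return 0.5 * (1.0 + math.erf(-spread / (sigma * math.sqrt(2.0))))


def expected_upsets_at_or_above(games_per_year, sigma=14.112):
    """For each spread s, the expected yearly number of upsets among
    games with a spread of s or larger.

    games_per_year maps spread -> typical number of games per year.
    """
    running_total = 0.0
    cumulative = {}
    for spread in sorted(games_per_year, reverse=True):
        running_total += games_per_year[spread] * upset_probability(spread, sigma)
        cumulative[spread] = running_total
    return cumulative


# HYPOTHETICAL game counts per spread, for illustration only.
example_counts = {25.0: 20, 30.0: 10, 35.0: 5, 40.0: 2}

for spread, expected in sorted(expected_upsets_at_or_above(example_counts).items()):
    # 1 / expected is the average number of years between such upsets.
    print(f"spread >= {spread}: {expected:.3f} upsets per year, "
          f"about one every {1.0 / expected:.1f} years")
```

The reciprocal of the expected count is what turns a spread into an "N-year storm."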

That is all for now, enjoy the games this weekend!
