
How Good Are Preseason Rankings?

Around this time of year, the various preseason college football publications start appearing on the shelves. For some time now, I have wondered whether there is a good way to evaluate how good or bad the various preseason rankings really are. This year, I decided to try to figure it out. Now, it would be straightforward to simply compare the various preseason rankings to the final College Football Playoff ranking, AP poll, or coaches poll. But that only tells the story for about a third of all Division 1 teams, and I was looking for something a bit more comprehensive.

From time to time, I have discussed and posted data based on an algorithm that I developed to generate my own power rankings. Since my method assigns a ranking to all 128 Division 1 teams, is typically a reasonable predictor of Vegas spreads (more on that later), and since I also tabulate preseason predictions from various sources to support my annual preseason analysis (coming soon to a message board near you), it occurred to me that I had all the data I needed to make this comparison. So, I went back over the data from the last 10 years or so, took each full 128-team preseason ranking (from sources such as Phil Steele, Athlon's, Lindy's, ESPN's FPI, and SP+), and tabulated the average absolute difference between those rankings and my algorithm's post-season rankings for all Division 1 teams. The results are shown below:

[Table: average absolute deviation between each source's preseason rankings and my algorithm's post-season rankings, by year]

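For anyone who wants to play along at home, the calculation itself is simple. Here is a minimal Python sketch, where the `mean_abs_deviation` helper, the team names, and the ranks are all my own illustrative placeholders rather than the real data:

```python
# A minimal sketch of the comparison described above, assuming each ranking
# is stored as a {team: rank} dictionary. Teams and ranks here are
# hypothetical placeholders, not real data.
import numpy as np

def mean_abs_deviation(preseason: dict, postseason: dict) -> float:
    """Average absolute rank difference over teams present in both rankings."""
    common = preseason.keys() & postseason.keys()
    return float(np.mean([abs(preseason[t] - postseason[t]) for t in common]))

pre  = {"Team A": 1, "Team B": 3, "Team C": 18, "Team D": 20}
post = {"Team A": 2, "Team B": 5, "Team C": 12, "Team D": 81}
print(mean_abs_deviation(pre, post))  # -> 17.5
```
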
Now, as you can see, I do not have a perfect data set to work with. I only have rankings from multiple sources for the last 5 years, and you also must trust that my algorithm is a reasonable approximation of the relative strength of teams. In any event, there are several interesting observations from this table:

First, for the limited data that I have, Phil Steele's publication consistently shows the smallest error between the preseason rankings and my simulated post-season rankings. I have his data as the best in 4 of the 5 years where I have rankings from multiple sources. He always advertises that his rankings are the most accurate, and I cannot dispute that with this analysis. Second, that said, there is not a huge difference between the different publications. So, there is no strong reason to rush out and buy any one of these publications over the others based on the rankings alone (I will comment a little more on this later). Third, none of the publications gets all that close to the final rankings. The average deviations are all in the range of 15-20 slots, which on a 128-team scale is an average error of roughly 15%. That does not seem great to me.

I wanted to dive a little deeper into the third point. As the table indicates, I have the most historical data on Phil Steele's rankings, so I decided to go back ten years and compare all of his preseason rankings to all of my post-season rankings. There are several ways to look at this data, but I find the most informative to be a histogram of the deviations, a scatter plot, and a plot of the average post-season ranking as a function of the initial Phil Steele ranking (basically the scatter plot data, where the y-axis instead contains the average and standard deviation / error bars for each rank instead of each individual data point). Once again, there are several conclusions we can draw from this data.
First, the histogram gives us an idea of the distribution of the deviations. It is fairly bell shaped, with 24% of the picks falling within +/- 5 slots of the final ranking and 41% falling within +/- 10 slots. But the tails of the distribution are also fairly long: 23% of all of Steele's picks are not within 30 slots of the final ranking.

The scatter plot tells a very similar story, and in this case we can see that the correlation (R-squared = 0.66) is OK, but not that great. The scatter plot also tends to highlight the real misses, like when Steele ranks a team in his top 20 (like Illinois in 2009) only for that team to wind up 3-9 with a ranking in the 80s by my algorithm, or when teams like Utah St. and San Jose St. in 2012 are ranked around 100 by Steele but wind up ranked in the top 25 by my algorithm and the national polls.

The plot of average ranking vs. initial ranking shows the Phil Steele data in perhaps the best light. It shows that for any given preseason ranking, on average, Steele is pretty close, but the spread around that average is still quite large. Notably, the deviation is much smaller for teams in Phil Steele's ~Top 5. Historically, those teams usually do wind up having great seasons, but there are exceptions (like the 2007 Louisville team, which started ranked #4 but ended 6-6). That said, the same trend is also found at the bottom end of the chart, so it might have more to do with the fact that teams ranked high (or low) only really have one direction to go: down (or up). That effect is best illustrated by a plot of the standard deviation of the post-season ranking as a function of the preseason ranking (basically, a plot of the error bars vs. preseason rank), which is shown here with a clear parabolic trend.

[Figures: histogram of rank deviations; scatter plot of post-season vs. preseason rank; average post-season rank vs. preseason rank with error bars; standard deviation of post-season rank vs. preseason rank]

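For the curious, all of the summary numbers above fall out of a few lines of NumPy. This is a sketch under my own assumptions: `pre` and `post` are equal-length arrays holding each team-season's preseason and post-season rank, and the variable names and bin choices are illustrative:

```python
# Sketch of the deviation summaries above. Assumes `pre` and `post` are
# equal-length NumPy arrays of preseason and post-season ranks for the
# same team-seasons.
import numpy as np
import matplotlib.pyplot as plt

dev = post - pre                       # signed deviation in rank slots
print(np.mean(np.abs(dev) <= 5))       # share within +/- 5 slots (~24%)
print(np.mean(np.abs(dev) <= 10))      # share within +/- 10 slots (~41%)
print(np.mean(np.abs(dev) > 30))       # long-tail misses (~23%)

# Histogram of the deviations:
plt.figure()
plt.hist(dev, bins=range(-100, 101, 5))
plt.xlabel("Post-season rank minus preseason rank")
plt.ylabel("Count")

# R-squared of the scatter plot, via the Pearson correlation:
r = np.corrcoef(pre, post)[0, 1]
print(f"R^2 = {r**2:.2f}")             # ~0.66 for the Steele data

# Average post-season rank (with error bars) at each preseason rank:
ranks = np.unique(pre)
avg = np.array([post[pre == r0].mean() for r0 in ranks])
err = np.array([post[pre == r0].std() for r0 in ranks])
plt.figure()
plt.errorbar(ranks, avg, yerr=err, fmt="o")
plt.xlabel("Preseason rank")
plt.ylabel("Average post-season rank")
```
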
What is perhaps the most interesting aspect of all of this to me harkens back to my second observation from the first table above: the deviations from the different publications are all basically the same for a given year. To visualize this, I plotted the predictions from the two publications for which I have the most data tabulated (Phil Steele and Athlon's) against each other in a scatter plot, which is shown here:

[Figure: Athlon's preseason rankings vs. Phil Steele preseason rankings]

Not surprisingly, the correlation between the two sets of predictions is rather high (R-squared = 0.91) and much higher than the correlation to reality, so to speak. So, as my first conclusion, I think we can say that preseason predictions are OK, but not great (they are certainly not destiny), and they agree with each other far more than they agree with the actual results on the field.

This analysis led me to think about another interesting topic, which is related to the first. Now that we have looked at the robustness of preseason rankings, what about in-season predictions? More specifically, what about metrics such as ESPN's vaunted FPI? In the 2016 season, I decided to put the FPI to the test alongside my own algorithm to see how they performed. As it turns out, this is a tricky question, because defining "performance" in this context is not as easy as you might think. A big part of the reason why is that there is generally a very poor correlation between any predicted margin of victory and the actual result. The best predictor, I suppose not surprisingly, is the Vegas spread, and a scatter plot of the actual game margins vs. the opening Vegas spreads for the entire 2016 season is shown here. As you can see, the R-squared is a pathetic 0.214. But this is better than the FPI, which only mustered an R-squared of 0.196, and, sadly, my algorithm, which only mustered an R-squared of 0.167. I won't bother to show you those plots, as they both look like shotgun blasts.

[Figure: actual game margins vs. opening Vegas spreads, 2016 season]

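For reference, the R-squared values quoted throughout this section are just the squared Pearson correlation between the predicted and actual margins. A sketch, assuming `predicted` and `actual` are equal-length arrays of per-game margins signed from the same team's perspective (the variable names are mine):

```python
# Squared Pearson correlation between predicted and actual game margins.
# Assumes both arrays are signed from the same (say, home team's) perspective.
import numpy as np

def r_squared(predicted, actual):
    r = np.corrcoef(predicted, actual)[0, 1]
    return r * r

# e.g. r_squared(vegas_open, margins)  -> ~0.21 for 2016
#      r_squared(fpi_margin, margins)  -> ~0.20
#      r_squared(my_margin, margins)   -> ~0.17
```
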
Last year, as I pored over the FPI data, I noticed something odd: it was quite rare for the FPI to predict a Vegas upset. I counted only 37 predicted upsets out of over 750 games (5%), which is interesting because historically about 25% of all college games wind up being upsets per Vegas; 2016 alone saw over 200 upsets. My algorithm picked over 80 upsets for the season. Granted, it was only right about the upset 37% of the time (below my algorithm's historical average of 40%), while the FPI, for all its caution, still got only 46% of its upset picks correct. When I plotted the full-year projected margins from the FPI versus the Vegas spread (see below), the correlation is quite good (R-squared = 0.86). By comparison, my algorithm did not do quite as well, but it is still fairly highly correlated (R-squared = 0.72).

[Figure: FPI projected margins vs. opening Vegas spreads, 2016 season]

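In case it is not obvious how I am scoring these, here is a sketch of the upset tally under a sign convention I chose for illustration: every spread and margin is signed from the home team's perspective, a model "picks an upset" when its sign disagrees with Vegas, and the pick is correct when the actual margin matches the model's sign:

```python
# Sketch of the upset-pick tally. Sign convention (my assumption): spreads,
# model margins, and actual margins are all signed from the home team's
# perspective, so positive means the home team is favored / won.
import numpy as np

def upset_pick_record(model, spread, margin):
    """Return (number of predicted Vegas upsets, fraction of them correct)."""
    picks = np.sign(model) != np.sign(spread)      # model disagrees with Vegas
    correct = picks & (np.sign(margin) == np.sign(model))
    n = int(picks.sum())
    return n, (int(correct.sum()) / n if n else 0.0)

# e.g. upset_pick_record(fpi_margins, vegas_open, margins) -> (37, 0.46) in 2016
```
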
From all of this, I come to my second main conclusion: in-season algorithms don't do a good job of predicting the outcomes of actual games, but they can do a good job of predicting the Vegas spread. In this regard, the FPI (and to a lesser extent, my algorithm) does have value for things such as projecting point spreads 2-3 weeks in advance. That type of analysis appears to be fairly robust. I also must concede that the FPI does a better job of predicting these spreads than my algorithm does (which I would expect, considering they most likely have more than one dude working on it in his spare time). But you could argue that the FPI is so good at predicting the spread that it doesn't add much to the discussion. It is, on some level, too conservative. At least my algorithm takes some chances and will make more than 1-2 upset picks a week. But at the end of the day, the gold standard is the Vegas spread, which honestly makes sense. After all, if there were a computer program out there that could beat Vegas, somebody would be very rich, and they would certainly not tell the rest of us about it.

So, with this knowledge, perhaps the most useful figure that I can leave you with is the following: the 5-point boxcar-averaged plot of the probability of the favored team winning as a function of the opening Vegas spread for all college games back to 2009. As you can see, once the data is smoothed, it forms a nice quadratic curve from a 50-50 toss-up to a virtual sure thing once the spread reaches around 30. (In reality, there have been a total of 2 upsets in games where the spread exceeded 30 since 2009, a frequency of less than 1%.) The fit is not perfect, but the equation on the chart is very simple and easy to remember. I would imagine the line should asymptotically approach 100% but never actually reach it, because in college football, I believe the underdog always has a chance.

[Figure: probability of the favored team winning vs. opening Vegas spread, 5-point boxcar average with quadratic fit]

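If you want to rebuild that curve yourself, here is a sketch of the smoothing and the fit. The variable names, the binning of half-point spreads to whole numbers, and the 1-to-30 range are all my simplifications:

```python
# Sketch of the smoothed win-probability curve. Assumes `spread` is an array
# of opening lines (points by which the favorite is favored) and `fav_won`
# is a 0/1 array marking whether the favorite won, one entry per game.
import numpy as np

spreads = np.arange(1, 31)
p_raw = np.array([fav_won[np.round(spread) == s].mean() for s in spreads])

# 5-point boxcar (moving) average to smooth the raw win frequencies:
p_smooth = np.convolve(p_raw, np.ones(5) / 5, mode="same")  # edges are rough

# Quadratic fit of the kind shown on the chart:
a, b, c = np.polyfit(spreads, p_smooth, 2)
print(f"P(favorite wins) ~ {a:.5f}*s^2 + {b:.4f}*s + {c:.3f}")
```
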
This brings us to my final conclusion for this piece:  college football is unpredictable, and that is why we love it. 

