(This was originally posted to several internet sources in March of 2015)
Hi everyone!
I am new here, but I was encouraged by a friend of mine to post some data that I have compiled on whether certain college coaches are over-rated or under-rated based on NCAA tournament performance. The original question that I tried to answer was "Is Bo Ryan over-rated?" This analysis was primarily meant to answer that question, but I uncovered a lot more.
I have compiled a rather large NCAA tournament database over the years. It essentially holds the results of every game since seeding began in 1979 in a spreadsheet, so that I can perform various analyses on the data if I wish. It occurred to me over the weekend that it would be possible to measure whether a coach is doing better than expected using one of two different methods.
The first method is to compare a coach's record as a certain seed against the historical performance of all teams of that seed. For example, since 1979, 1-seeds have a win-loss record of 447-124 (0.783). Tom Izzo's lifetime record as a 1-seed is 16-3 (0.842). So, by this metric, Izzo does better than average as a 1-seed. More quantitatively, if an average coach playing 19 games as a 1-seed would be expected to win 78.3% of the time, you would expect that coach to win 14.87 of those games. Since Izzo has won 16, he is 1.13 games "above average" as a 1-seed. If you have the right set of data (which I do), you can then perform the same calculation for Izzo as a 2-seed, 3-seed, etc. and obtain an overall number of games above or below average for his career in the NCAA tournament. You can also perform the same calculation for all 566 coaches who have coached at least 1 game in the NCAA tournament.
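For anyone who wants to reproduce the arithmetic, here is a minimal Python sketch of the seed-based calculation. It is only an illustration: the function and variable names are made up for this example, and the only data in it are the two 1-seed records quoted above. The real calculation sums over every seed a coach has held.

```python
from typing import Dict, Tuple

Record = Tuple[int, int]  # (wins, losses)


def win_rate(record: Record) -> float:
    """Fraction of games won for a (wins, losses) record."""
    wins, losses = record
    return wins / (wins + losses)


def games_above_average(coach_by_seed: Dict[int, Record],
                        history_by_seed: Dict[int, Record]) -> float:
    """Sum, over each seed the coach has held, of actual wins minus
    the wins an average team of that seed would be expected to get."""
    total = 0.0
    for seed, (wins, losses) in coach_by_seed.items():
        expected_wins = (wins + losses) * win_rate(history_by_seed[seed])
        total += wins - expected_wins
    return total


# Only the 1-seed figures quoted in the post; other seeds are omitted.
history = {1: (447, 124)}
izzo = {1: (16, 3)}

score = games_above_average(izzo, history)
print(round(score, 2))  # 1.13
```

Adding more seeds to both dictionaries extends the same sum to a coach's full tournament career.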
The histogram summarizing this calculation is shown here:
As you can see, the distribution is roughly a bell curve centered at zero with an upper limit close to 6 games and a lower limit of around -4 games.
However, this calculation troubles me a little bit because it does not take into account that some 1-seeds have to play a 4-seed and then a 2-seed to make the Final Four (like MSU did in 2000) while some other 1-seeds may draw a 12-seed and then an 11-seed (like MSU did in 2001). In other words, not all tournament draws are created equal. In order to correct for this, I instead compared each coach's performance to the average based on seed differential and not just raw seed. As it turns out, the relationship between seed differential and winning percentage is very close to linear. If the difference in seed between two teams is only 1 (a 1-seed vs. a 2-seed), the higher-seeded team only wins about 56% of the time, all the way up to a maximum differential of 15 (a 1-seed vs. a 16-seed), where the probability of winning is 100%. In case you are curious, this correlation looks like this (where the size of the data label roughly corresponds to the number of data points for each seed differential; as you can guess, some differentials are much less common than others):
So, using the same methodology as described above, I was able to calculate each coach's performance against the average in terms of seed differential. When I do this, the histogram looks like this:
Again, it looks like a bell curve centered at zero, this time with a maximum value of 7 games (above average) and a minimum value of -5 (below average).
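The seed-differential version can be sketched the same way. To be clear about the assumptions: the straight line below simply connects the two figures quoted above (about 56% at a differential of 1 and 100% at a differential of 15), which is not the fit to the full data set. Treating a same-seed matchup as a 50/50 game is also an assumption, and all names are hypothetical. The two sample draws are the MSU regional paths mentioned earlier, counted as wins since both teams reached the Final Four.

```python
from typing import Iterable, Tuple

# Assumed endpoints of the line, taken from the two figures in the post.
P_AT_DIFF_1 = 0.56
P_AT_DIFF_15 = 1.00

Game = Tuple[int, int, bool]  # (own seed, opponent seed, won?)


def favorite_win_prob(seed_diff: int) -> float:
    """Win probability of the better-seeded team for a given seed gap."""
    if seed_diff == 0:
        return 0.5  # assumption: equal seeds are a coin flip
    slope = (P_AT_DIFF_15 - P_AT_DIFF_1) / (15 - 1)
    return P_AT_DIFF_1 + slope * (seed_diff - 1)


def expected_win_prob(own_seed: int, opp_seed: int) -> float:
    """Win probability from the point of view of the coach's own team."""
    p = favorite_win_prob(abs(own_seed - opp_seed))
    # A lower seed number means the better seed.
    return p if own_seed <= opp_seed else 1.0 - p


def games_above_average_by_diff(games: Iterable[Game]) -> float:
    """Actual wins minus expected wins, summed game by game."""
    return sum((1.0 if won else 0.0) - expected_win_prob(own, opp)
               for own, opp, won in games)


hard_draw = [(1, 4, True), (1, 2, True)]    # 1-seed beats a 4, then a 2
soft_draw = [(1, 12, True), (1, 11, True)]  # 1-seed beats a 12, then an 11

print(games_above_average_by_diff(hard_draw))
print(games_above_average_by_diff(soft_draw))
```

Under these assumptions the two wins against the harder draw earn more credit than the two wins against the softer one, which is exactly the correction the raw-seed method lacks.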
So, where do some of the notable coaches fall on these scales? Well, I removed all the data for coaches with fewer than 10 games in the Big Dance to simplify things, which cuts the number down to 121 coaches. In terms of performance just based on seed, the simplified histogram now looks like this, including the names of notable coaches:
Using this format, it is pretty easy to separate the men from the boys. Based on this metric, Tom Izzo has out-performed all other coaches in the history of the tournament since 1979 with a very impressive "score" of 5.65 games (i.e. wins over his career) above the expected/average performance. You can also see that there is a pack of 10 coaches or so, including Billy Donovan, Jim Calhoun, Rick Pitino, Calipari, Brad Stevens, and Coach K, that are head and shoulders above everyone else. To get back to the original question, Bo Ryan is slightly above average at 1.15, just behind Bill Self at 1.34. Also notable is that coaches like Thad Matta and Mark Few have scores below 0.5, while legends such as Lute Olson and Bob Huggins score below average. Worse yet are Gene Keady at -3 and Rick Barnes at almost -3.5.
But, as I mentioned, I think a fairer comparison is the one using seed differentials. The simplified and labeled histogram based on this calculation is shown here:
By this metric, Izzo finishes 4th all time at 6.27, just behind Roy Williams, Villanova legend Rollie Massimino, and Rick Pitino. I suppose that this means that historically, Izzo has gotten a slightly easier draw than Pitino and Roy Williams. I must admit that I am surprised to see Roy Williams so high in this calculation. It seems like he needs a full NBA roster to be successful in the tournament. But, I guess if nothing else, this shows that Ol' Roy's teams are properly placed in the tournament. Furthermore, even though he has been upset 11 times, 7 of those have been in situations where the seed differential is 3 or less. That is better than most coaches do in those situations.
Other notable observations are that Billy Donovan does not do quite as well in this analysis, which suggests he has also benefited from some good higher-seeded teams being upset before they reached his team. Also, Beilein checks in at a very respectable 3.23, while other notable coaches such as Jim Boeheim (0.40), Bill Self (0.23), and Thad Matta (0.63) are quite average.
That brings us back to the original question: is Bo Ryan an underachiever? Well, based on this metric, his score is -0.47, significantly lower than his score using just the raw seed. This confirms our concern that he has benefited from a couple of soft tournament draws. So, yes, I think that we can definitively say that he is an under-achiever. Now, that is not to say that he is a bad coach any more so than Lute Olson (-0.23) is a bad coach. But, by this metric, Bo only ranks 82nd among the 121 coaches with at least 10 tournament games.
As a final note, I have always had a gut feeling that Coach K has been a bit of an under-achiever over the past 10-15 years. Based on the data above, that does not seem to be the case. However, I decided to see what would happen if I re-did the calculation using only games played in the "Tom Izzo era" of the tournament (since 1998). When I do this, Coach K's numbers do, in fact, take a pretty big hit. His score based on raw seed is actually negative (-0.98), while his score based on seed differential is only 0.27, which is in the Bill Self range. Color me not so impressed.
Anyway, I hope that you all enjoy reading this analysis as much as I enjoyed putting it together.