So, the idea for this fanpost came out of some discussion about Sackseer and how effective it really is at predicting sack performance. We seem to have a contingent here at BTB that thinks Sackseer has probative value in predicting pass-rushing performance.
I wonder how valid this belief is, though, given:
- The supporter of the stat is a paid site (footballoutsiders.com) which could benefit financially if they're able to get the stat legitimized.
- There's no peer review of the work, nor ANY WAY TO PEER REVIEW the work, as it's a "proprietary formula."
- The main variable which is related to success according to Sackseer is Missed Eligible Games in College. Not Performance, not speed, not strength. But number of games missed.
- The person who came up with the formula is an attorney.
Now, this isn't necessarily a problem in coming up with knowledge, per se. But there are some other squirrelly things in the methodology I've read that lead me to the conclusion that the author being a lawyer is a direct hindrance to trusting that the stat is actually effective. Among those are the following:
- The author is not statistically savvy. The author states that:
"The trends that SackSEER identifies for edge rushers drafted in the first two rounds also persist when later-round edge rushers are modeled. For instance, SackSEER would have identified Robert Mathis and Adalius Thomas as top edge rusher prospects. However, the trends are not quite as strong and the projections not quite as accurate (the seven-round regression yields R-Squared of .27). Nonetheless, 'seven-round SackSEER' does have some interesting implications for the two-round model. First, the seven-round model demonstrates that it is unlikely that SackSEER is a product of 'data-mining bias.'"
Unfortunately, it doesn't do anything of the sort, since the seven-round model also contains the first two rounds, which supply the data that the author used to create his formula. This fundamental lack of understanding of statistics leads me to suspect that the author not coming from a statistical (or heck, even mathematical/engineering) background seriously hinders his methodology.
- To be a good lawyer, you must be an advocate of a position.
To be a good scientist/statistician, you must NOT be an advocate of a position. You must let the data tell you what is correct and what is not.
Take a look at some of the lines in the quote above:
"The trends that SackSEER identifies for edge rushers drafted in the first two rounds also persist when later-round edge rushers are modeled. For instance, ..."
Then he goes on to list some examples and concedes only that the trends are "not quite as strong," despite the fact that the R-squared drops from .42, which is a decent relationship, to .27, which isn't that strong (and that's not taking into account that the real relationship outside the first two rounds is almost definitely worse than a .27 R-squared).
It strongly seems to me that he's advocating for something here, rather than letting the data tell him what's going on.
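To illustrate why a pooled seven-round R-squared can't vouch for the later rounds on its own, here's a toy sketch in Python. The numbers are entirely made up (they are NOT real draft or sack data), and the helper `r_squared` is just my own illustration. One group with a fairly strong relationship stands in for rounds 1-2, one with a weak relationship stands in for rounds 3-7, and we compare each group's fit with the fit on everything lumped together.

```python
def r_squared(xs, ys):
    """R-squared of a simple one-variable linear regression
    (equal to the squared Pearson correlation)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy ** 2 / (sxx * syy)

# Made-up "projection score" (x) and "actual production" (y) values.
# Hypothetical stand-in for rounds 1-2: outcome tracks the score closely.
early_x = [1, 2, 3, 4, 5, 6]
early_y = [1, 3, 2, 5, 4, 6]

# Hypothetical stand-in for rounds 3-7: outcome barely tracks the score.
late_x = [1, 2, 3, 4, 5, 6]
late_y = [3, 1, 4, 2, 5, 3]

r2_early = r_squared(early_x, early_y)
r2_late = r_squared(late_x, late_y)
# "Seven-round" fit: both groups lumped into one regression.
r2_pooled = r_squared(early_x + late_x, early_y + late_y)

print(f"rounds 1-2 only: {r2_early:.2f}")
print(f"rounds 3-7 only: {r2_late:.2f}")
print(f"all rounds:      {r2_pooled:.2f}")
```

In this toy setup the pooled number sits between the two groups, so it looks a lot better than the later-round group does by itself. That's not guaranteed for every data set, but it's the reason a pooled .27 tells you very little about how the formula does on rounds 3-7 alone.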
Anyways, those are just some of my concerns with the Sackseer stat and its methodology. But I thought it'd be interesting to actually look at the data and the predictions and see how well the predictions are doing relative to the actual data.
Now, I acknowledge that there's a danger in drawing conclusions prematurely, since the stat is SUPPOSED to predict 5-year sack numbers. But we can get an early sense of whether there's anything interesting in the predictions or whether there might be some red flags about the stat.
A caveat I want to offer when looking at the data is that I'm using pro-ration as the comparison with the prediction. Now, pro-ration is probably not the best method of predicting what a player will do in the future. After all, some of them will grow and become better players (and some haven't even played in games). However, there IS also the downside of a potential career-ending injury, and there's also the issue that these are 1st and 2nd round talents, so the majority of them SHOULD be having an impact in their first year.
So, I'm going to go with pro-ration for now mainly since it's easily understandable and it's easy to implement. If you have a better idea, please feel free to suggest.
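To make the pro-ration concrete, here's a minimal sketch in Python. It assumes pro-ration just means multiplying the rookie-year sack total by five seasons, which is what matches the Proration column for the second-round players listed further down (4.5 sacks becomes 22.5). The function name `prorate` and the `seasons` parameter are my own labels for illustration, not anything from the Sackseer write-up.

```python
def prorate(first_year_sacks, seasons=5):
    """Naive pro-ration: assume every season looks like the first one.

    This is an assumption for comparison purposes only; it ignores
    player development, injuries, and changes in playing time.
    """
    return first_year_sacks * seasons

# 2010 sack totals for the second-round players discussed in this post.
sacks_2010 = {
    "Koa Misi": 4.5,
    "Jermaine Cunningham": 1.0,
    "Jason Worilds": 1.0,
}

prorated = {name: prorate(sacks) for name, sacks in sacks_2010.items()}

for name, total in prorated.items():
    print(f"{name}: {total} sacks over 5 seasons at the 2010 pace")
```

Changing `seasons` gives the pace over a different window, if a full five-year stretch seems like too much to extrapolate from one season.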
Here are the numbers:
| Player | Prediction | 2010 Sacks | Proration | Vertical | Short Shuttle | SRAM | Missed Games |
|---|---|---|---|---|---|---|---|
Hmmm, it doesn't look good for Sackseer at this time. The two lowest predictions are currently doing the best of all the pass-rushers it made predictions for.
Just for kicks, I thought I'd take a look at the other players drafted in the 2nd round and compare their raw numbers (remember, I don't have access to the real Sackseer numbers):
| Player | 2010 Sacks | Proration | Vertical | Short Shuttle | Production | Missed Games |
|---|---|---|---|---|---|---|
| Koa Misi | 4.5 | 22.5 | 38" | 4.27 | 10.5 S in 38 G | 14 |
| Jermaine Cunningham | 1.0 | 5.0 | 35" | ? | 18.5 S in 45 G | 10 |
| Jason Worilds | 1.0 | 5.0 | 38" | 4.29 | 15 S in 39 G | 13 |
Misi looks somewhat comparable to Kindle. His vertical is slightly higher and his shuttle time is much faster, but his production isn't as good and his missed games are significantly worse. So I'd guess a prediction of about 18.8?
Cunningham also looks somewhat comparable to Kindle. But we can't really know since we don't have a Shuttle time on him.
Worilds looks like he should be the best of the bunch, since his vertical is tied for the best, his short shuttle is comparable to Misi's, his production is better, and his missed eligible games are about the same as Misi's.