How to Beat the Point Spread
Last season I received several comments asking me to make game predictions against the spread. I tend to resist doing spread predictions. I'm not interested in gambling myself--I just don't have that gene. Plus, I'm more interested in analyzing the sport of football for its own sake rather than exploiting it for money. But I will oblige some of my readers (and my own curiosity) in this post.
As a basis for my investigation of office pool strategies, I've recently developed a database of games that includes spread data, so I really couldn't help seeing if I could beat the spread with any regularity.
Additionally, this post at the Sabermetric Research blog suggested that most NFL games have obvious favorites, so there are only a few dozen games per year in which experts could reasonably disagree about the likely winner. Picking winners straight up may not be the best way to test football expertise because it would take several years of games for true expertise to overcome random luck. (But I'm not convinced that picking against the spread is any better.)
I experimented with a linear regression that predicted actual point spreads. I also tried a logistic regression (binary outcome regression) of whether a team would beat the spread. Neither worked very well. The method I ultimately stumbled upon predicted correctly against the spread for about 60% of the 2007 regular season, which seems to be pretty good. According to this site that tracks over 50 handicappers and systems, my method would come out on top. It's only one season of results, but there is a solid theoretical foundation to why it works.
One of my recent posts discussed how defensive interceptions appear to be totally random. Teams cannot sustain high or low interception rates throughout the season. I thought that if handicappers and bettors are factoring defensive interceptions into their spreads, then you'd do well to bet against teams with high to-date interception rates. Unfortunately, that theory was a bust. It beat the spread only 50% of the time.
Then I remembered a comment from a reader early last season about how teams that win outright also beat the spread over 80% of the time. (It's actually 82.7% over the past six years.) So a system that picks straight-up (SU) winners 70% of the time could beat the spread at least 70% × 80% = 56% of the time. It turns out that my SU model is actually a pretty good against-the-spread (ATS) system too.
(This applies to the 2007 regular season weeks 4 through 16. I needed a few weeks of data for reasonable predictions, and I excluded week 17 because of its notorious unpredictability due to playoff teams resting their starters. (There's no way Troy Smith leads the 2007 Ravens over the Steelers in any other week.))
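The 70% and 80% arithmetic can be written out as a short sketch. It also adds the term the back-of-envelope figure leaves out: picks that lose outright but still cover. The function name is hypothetical, and the second term rests on two assumptions that are not from the data: no pushes, and an SU hit rate that is independent of whether the winner covers. Treat the result as a rough estimate, not a measured rate.

```python
def ats_accuracy(su_accuracy, winner_cover_rate):
    """Estimate ATS accuracy when the SU pick doubles as the ATS pick.

    Assumptions (illustrative only): no pushes, and the model's SU hit
    rate is independent of whether the outright winner covers.
    """
    # Pick wins outright and covers: the 70% * 80% term.
    win_and_cover = su_accuracy * winner_cover_rate
    # Pick loses outright but still covers (an underdog that loses by
    # less than the spread). With no pushes, exactly one team covers,
    # so losers cover at 1 - winner_cover_rate.
    lose_and_cover = (1 - su_accuracy) * (1 - winner_cover_rate)
    return win_and_cover, win_and_cover + lose_and_cover


floor, estimate = ats_accuracy(0.70, 0.827)
print(f"floor: {floor:.3f}, including losing picks that cover: {estimate:.3f}")
```

Under those assumptions the product alone is a floor, which is why the figure is quoted as "56+%".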
The table lists the number of games and the accuracy of my SU model in predicting the ATS winner for each spread size. For example, in 2007 there were 31 games when the spread was either +3 or -3. My SU model was correct in predicting the ATS winner in 61% (19) of those games. The cumulative number of games and ATS accuracy is also listed.
| Spread (points) | ATS Accuracy | Games | Cumulative Games | Cumulative ATS Accuracy |
| 0 | 0.50 | 4 | 4 | 0.50 |
| 1 | 0.83 | 6 | 10 | 0.70 |
| 1.5 | 0.71 | 7 | 17 | 0.71 |
| 2 | 1.00 | 3 | 20 | 0.75 |
| 2.5 | 0.25 | 8 | 28 | 0.61 |
| 3 | 0.61 | 31 | 59 | 0.61 |
| 3.5 | 0.56 | 27 | 86 | 0.59 |
| 4 | 0.83 | 6 | 92 | 0.61 |
| 4.5 | 0.50 | 6 | 98 | 0.60 |
| 5 | 0.00 | 1 | 99 | 0.60 |
| 5.5 | 0.80 | 5 | 104 | 0.61 |
| 6 | 0.33 | 9 | 113 | 0.58 |
| 6.5 | 0.80 | 5 | 118 | 0.59 |
| 7 | 0.56 | 9 | 127 | 0.59 |
| 7.5 | 0.75 | 4 | 131 | 0.60 |
| 8 | 0.50 | 6 | 137 | 0.59 |
| 8.5 | 0.50 | 4 | 141 | 0.59 |
| 9 | 0.80 | 5 | 146 | 0.60 |
| 9.5 | 0.80 | 10 | 156 | 0.61 |
| 10 | 0.33 | 6 | 162 | 0.60 |
| 10.5 | 0.33 | 9 | 171 | 0.58 |
| 11 | 0.00 | 1 | 172 | 0.58 |
| 11.5 | 1.00 | 2 | 174 | 0.59 |
| 12.5 | 1.00 | 2 | 176 | 0.59 |
| 13 | 1.00 | 1 | 177 | 0.59 |
| 13.5 | 1.00 | 1 | 178 | 0.60 |
| 14 | 1.00 | 1 | 179 | 0.60 |
| 14.5 | 0.00 | 1 | 180 | 0.59 |
| 16 | 0.75 | 4 | 184 | 0.60 |
| 16.5 | 1.00 | 1 | 185 | 0.60 |
| 17+ | 0.00 | 4 | 189 | 0.59 |
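For anyone who wants to check the cumulative columns, here is a minimal sketch that rebuilds them from the per-spread rows. It uses only the first six rows of the table, and it assumes the number of correct picks in each row is the nearest whole number to accuracy times games, since the table reports accuracy rounded to two decimals.

```python
# (spread, accuracy, games) copied from the first six rows of the table.
rows = [
    (0.0, 0.50, 4),
    (1.0, 0.83, 6),
    (1.5, 0.71, 7),
    (2.0, 1.00, 3),
    (2.5, 0.25, 8),
    (3.0, 0.61, 31),
]


def cumulative(rows):
    """Return (spread, cumulative games, cumulative accuracy) per row."""
    out = []
    total_games = 0
    total_correct = 0
    for spread, accuracy, games in rows:
        # Assumption: recover the whole-number count of correct picks
        # from the rounded accuracy.
        total_correct += round(accuracy * games)
        total_games += games
        out.append((spread, total_games, total_correct / total_games))
    return out


result = cumulative(rows)
for spread, games, acc in result:
    print(f"{spread:>4} {games:>4} {acc:.2f}")
```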
Overall, the SU model is 59% against the spread. Although that's pretty good as is, if we cut off predictions at spreads of 5 or 5.5 points, the model is 60 to 61% accurate ATS. We could go out to 9.5 points and still have 60% accuracy. That includes 156 games, hopefully enough 'action' for those gambling enthusiasts out there.
Keep in mind I just stumbled on this, so the model is not optimized for ATS predictions. By tweaking it here and there, or better refining which games have the highest probability of beating the spread, the ATS accuracy could be further improved.
I realize there are thousands of handicappers out there who claim higher accuracy rates. But personally, I don't believe them. Statistical sleight of hand is not hard, especially when you control access to the data. I'd guess that most of them are cherry-picking results after the fact. Here's one example:
1. A handicapper always has 3 sets of picks, say his "guaranteed plays," his "locks of the week," and his "gold specials." He doesn't necessarily advertise which picks are in which category.
2. As the season goes on, naturally one of the three sets will outperform the others. Even if his picks overall are worse than 50%, there is a very good chance one set of picks will be better than 50%.
3. On his website, the handicapper advertises that his "guaranteed plays" are 21-13...over 60% accurate!
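The three steps above can be checked with an exact binomial calculation. This is a sketch under assumptions of my own choosing: each set holds 34 picks (matching the 21-13 record), every pick is a pure coin flip, and the three sets are independent. The function names are hypothetical.

```python
from math import comb


def prob_at_least(wins, picks, p=0.5):
    """Exact binomial probability of at least `wins` correct out of `picks`."""
    return sum(
        comb(picks, k) * p**k * (1 - p) ** (picks - k)
        for k in range(wins, picks + 1)
    )


def prob_best_of(sets, wins, picks, p=0.5):
    """Chance that at least one of `sets` independent pick sets reaches `wins`."""
    return 1 - (1 - prob_at_least(wins, picks, p)) ** sets


picks = 34  # assumption: each set has as many picks as the 21-13 example

one_winning = prob_at_least(18, picks)    # one set finishes above 50%
any_winning = prob_best_of(3, 18, picks)  # at least one of three does
one_21 = prob_at_least(21, picks)         # one set goes 21-13 or better
any_21 = prob_best_of(3, 21, picks)       # at least one of three does

print(f"one set above .500: {one_winning:.2f}, best of three: {any_winning:.2f}")
print(f"one set 21-13 or better: {one_21:.2f}, best of three: {any_21:.2f}")
```

Holding three sets and advertising only the best one raises the odds of having a winning record to show, even when no set has any real edge.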
Another factor is Darwinism. There are new handicappers that hang up their shingle all the time. Naturally, just by luck alone, half will be better than 50% and half will be worse than 50%. The unlucky ones will tend to close up shop, leaving the luckier handicappers in business and claiming to be geniuses. (This likely also applies to fund managers on Wall Street, which is one reason I prefer index funds.)
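The survivorship effect is easy to simulate. In this rough sketch every handicapper picks by coin flip, and anyone who fails to finish a season above 50% closes up shop. The starting population, the number of picks per season, and the quit rule are all hypothetical choices for illustration, not data.

```python
import random


def surviving_handicappers(n_start, seasons, picks_per_season, seed=0):
    """Count coin-flip handicappers still in business after each season.

    Assumption: a handicapper quits after any season that is not above 50%.
    """
    rng = random.Random(seed)
    survivors = n_start
    history = []
    for _ in range(seasons):
        still_open = 0
        for _ in range(survivors):
            wins = sum(rng.random() < 0.5 for _ in range(picks_per_season))
            if wins * 2 > picks_per_season:
                still_open += 1
        survivors = still_open
        history.append(survivors)
    return history


history = surviving_handicappers(n_start=1000, seasons=3, picks_per_season=100)
print(history)
```

Whoever is left after a few seasons has an unbroken winning record despite having no skill at all.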
This result is based on only one year of games, but I'm very confident it is repeatable. For 2006, I ran a very similar model, and it picked SU winners several percentage points more often than the spread's favorites did. That's really all you need. If you can significantly beat the spread as an SU predictor, you'll have about 60% accuracy ATS.
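Since this rests on a single season, it is worth asking how often a coin flipper would do as well over the same number of games. The sketch below runs an exact one-sided binomial check. The count of 111 correct picks is an assumption inferred from the rounded table (about 59% of 189 games), and the break-even rate assumes standard -110 odds, which this post does not otherwise discuss.

```python
from math import comb


def binomial_tail(correct, games, p=0.5):
    """Probability of at least `correct` hits in `games` tries at rate `p`."""
    return sum(
        comb(games, k) * p**k * (1 - p) ** (games - k)
        for k in range(correct, games + 1)
    )


games = 189
correct = 111  # assumption: roughly 59% of 189, inferred from the rounded table

p_luck = binomial_tail(correct, games)

# Assumption: risking 110 to win 100, so break-even is 110 / 210.
break_even = 110 / (110 + 100)

print(f"chance a coin flipper does this well: {p_luck:.4f}")
print(f"observed: {correct / games:.3f}, break-even at -110: {break_even:.3f}")
```

A small tail probability is encouraging, but it cannot rule out luck from one season alone, which is why the 2006 result matters.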
Well, you commented, and I listened. There's my ATS post. I've focused on gambling enough for a while. I feel like I need a shower. Like I said, it could probably be even more accurate with some tweaks, but my attention over the next few weeks will be on the upcoming draft.
7 comments:
Good work! I wouldn't have thought that 59% was possible.
You're not going to describe your method?
Phil-You'll have to subscribe to my premium site if you want the method. Just kidding.
Actually, the method is just using my same old SU win prediction model as an ATS model. Pick SU winners to beat the spread--that's it.
My SU model is a logit regression using net passing efficiency, run efficiency, turnover rates, penalty rates, and home field advantage.
The general description of the model is here, and the detailed description is here.
Wow, you're getting 60% just by picking winners to beat the spread?
But ... but ... I'm confused.
Consider heavy favorites, +10 or more, say. Every system in existence would pick them straight-up, right? So what you're saying is that heavy favorites regularly beat the spread with 60% probability?
Yes and no.
Yes--just use outright winner predictions to beat the spread. All you need is a system that picks winners slightly better than the spread does. Mine does, Sagarin's does, and others probably can too.
Winners, whether they were the favorite or underdog, cover the spread 83% of the time.
BUT...good point about heavy favorites. For some reason I always thought that very heavy favorites rarely covered, which is why I made the table with cumulative accuracy. I was looking for a point at which the accuracy really drops off. But for 2007 (my test set), favorites of 10+ points covered 14 out of 27 games. So there was no real drop off.
However, over the past 6 years, favorites of 10+ points covered only about 40% of the time (n=99). This suggests 2007 might be a slight aberration. So limiting your bets to games with more modest spreads might enhance accuracy.
One additional note: This was tested against closing lines, which are slightly more accurate than opening lines. I'd guess that if you bet early in the week, you might increase the accuracy by another percentage point or two. Team stat-based systems are as accurate as they're going to get each week by Tuesday morning.
Also, by staying away from games where stat-based systems are at a known disadvantage, such as when a star QB or other player is hurt, the system might improve. The trick would be to have sound objective criteria so personal biases don't creep into the decision. (You wouldn't be switching the bet, just staying away from the game.)
This is a very interesting article. I like the approach you take on how to beat the point spread. This is smart.
You tried to find the point spread threshold that gives the best ATS accuracy. Did you try the same thing but this time the variable would be the probability returned by your SU model? Maybe all games with a 0.65 or more probability to win SU will cover the spread 75% of the time.