Sep 28, 2007

NFL Win Prediction Methodology

Throughout the rest of the 2007 season I intend to publish win probabilities for each game, and season win projections for each team. This post explains the methodology used to calculate these.

Based on a logit (logistic) regression of every game played over the past five seasons, a mathematical model was established to estimate the probability that each opponent will win a game. The model is based on team efficiency stats, which include:

  • Offensive pass efficiency, including sack yardage
  • Defensive pass efficiency, including sack yardage
  • Offensive run efficiency
  • Defensive run efficiency
  • Offensive interception rate
  • Defensive interception rate
  • Offensive fumble rate
  • Penalty rate (penalty yards per play)

Home field is also included in the model. These factors were selected because they are most predictive of future performance, and not necessarily because they explain past performance. Over the past five seasons, the model predicts winners correctly in 69.8% of games (retrospectively). In 2006, the model was correct in 65% of games, well ahead of consensus favorites as determined by betting lines. Last year was a particularly difficult year for prognosticators, as consensus favorites won only 57% of the games.

Touchdowns, red zone performance, and third down success rates are not used in the model because I believe those things are the results of passing and running ability, among other factors. To include them in a model intended for prediction would guarantee it is severely "overfit." In other words, it would capture and explain the unique qualities of past events at the expense of predictive power.

Once the model is established, each game's outcome probability can be calculated. But there are other applications. By calculating the probability a team will win against a notional league-average team at a neutral site, a generic win probability can be determined for each team.
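As a rough illustration of how a fitted logit model turns team stats into a probability, here is a minimal Python sketch. The function names, the stat keys, and the choice to feed the model the difference between the two teams' stats are all assumptions made for illustration; the actual coefficients and functional form are not given in this post.

```python
import math

def logistic(x):
    """Map a logit (log-odds) value to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))

def game_win_probability(team, opponent, coefs, home_edge=0.0):
    """Probability that `team` beats `opponent`.

    team, opponent: dicts of efficiency stats keyed by stat name
                    (hypothetical keys, chosen for this sketch).
    coefs: dict of regression coefficients keyed by the same names
           (placeholders, not the fitted values from the real model).
    home_edge: logit adjustment; positive if `team` is at home,
               negative if away, 0.0 at a neutral site.
    """
    logit = home_edge + sum(c * (team[k] - opponent[k]) for k, c in coefs.items())
    return logistic(logit)

def generic_win_probability(team, league_avg, coefs):
    """Win probability against a notional league-average team at a neutral site."""
    return game_win_probability(team, league_avg, coefs, home_edge=0.0)
```

Under these assumptions, a team whose stats equal the league average gets a generic win probability of exactly 0.5, which makes a quick sanity check.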

This year the model includes an adjustment for opponent strength. This is especially important earlier in the season when there are fewer data points to establish each team's baseline performance levels. Each opponent's generic win probability is averaged for each team. It is then included back into the win model to refine each prediction. For example, a team with impressive stats against weak teams would not be favored as strongly as a team with similar stats against strong teams.
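The averaging step can be sketched as follows. This is only an illustration: the data layout, the function name, and the 0.5 default for a team with no games played are assumptions, and the post does not specify how the averaged value is fed back into the win model, so that step is left out.

```python
def average_opponent_strength(schedule, generic_wp):
    """Average the generic win probability of each team's opponents so far.

    schedule: dict mapping team -> list of opponents already played
              (hypothetical layout for this sketch).
    generic_wp: dict mapping team -> generic win probability.
    Returns a dict mapping team -> average opponent generic win probability.
    """
    strength = {}
    for team, opponents in schedule.items():
        if opponents:
            strength[team] = sum(generic_wp[o] for o in opponents) / len(opponents)
        else:
            # Assumption: with no games played, treat the schedule as league average.
            strength[team] = 0.5
    return strength
```

A team that has faced opponents averaging well under 0.5 has had an easy schedule, so its raw stats would be discounted; the reverse holds for a team with a high average.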

Another application of the opponent-adjusted generic win percentage is a ranking of each team. Such a ranking is similar to the now ubiquitous "power rankings." A better term for the rankings on this site would be "efficiency rankings."

Final win totals can also be estimated by calculating the win probabilities of a team's remaining games. By using the law of total probability, the probability that each possible final record will occur can be determined. For example, if a team has two games left, one with a 0.7 chance of winning and one with a 0.5 chance of winning, the probability of winning 0, 1, or 2 games can be calculated.

2 wins = 0.5 * 0.7 = 0.35
1 win = 0.5 * (1-0.7) + (1-0.5) * 0.7 = 0.50
0 wins = (1-0.7) * (1-0.5) = 0.15

The same math can be applied with many more games to go, but the calculation becomes far more complex. Once we determine the expected winning percentage for each team, we can compare it to actual outcomes to determine which teams have been lucky or unlucky.
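For more than a couple of games, the bookkeeping is easier to hand to a short program. The sketch below is one way to do it, not necessarily how the numbers on this site are produced: it builds the distribution one game at a time and assumes the games are independent. The function names are made up for illustration.

```python
def win_distribution(game_probs):
    """Return a list where entry k is the probability of exactly k wins.

    game_probs: win probability for each remaining game.
    Assumes game outcomes are independent of one another.
    """
    dist = [1.0]  # before any games: 0 wins with certainty
    for p in game_probs:
        new = [0.0] * (len(dist) + 1)
        for wins, prob in enumerate(dist):
            new[wins] += prob * (1.0 - p)   # lose this game
            new[wins + 1] += prob * p       # win this game
        dist = new
    return dist

def expected_wins(game_probs):
    """Expected number of wins: the sum of the individual win probabilities."""
    return sum(game_probs)
```

Calling `win_distribution([0.7, 0.5])` reproduces the two-game example: about 0.15 for 0 wins, 0.50 for 1 win, and 0.35 for 2 wins.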

Lastly, as playoff time approaches, we can go one step further. By applying the same principle of total probability, the outcomes of playoff races can be estimated.
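A very simplified two-team race gives the flavor. This sketch assumes the two teams' remaining results are independent (so no head-to-head games remain) and ignores tiebreakers; a real playoff race involves more teams and both of those complications. The function name and inputs are hypothetical.

```python
def race_outcome(wins_a, dist_a, wins_b, dist_b):
    """Return (P(A finishes ahead), P(tie), P(B finishes ahead)).

    wins_a, wins_b: current win totals for teams A and B.
    dist_a, dist_b: dist[k] is the probability of exactly k more wins.
    Assumes the two teams' remaining results are independent and
    ignores tiebreakers.
    """
    ahead = tie = behind = 0.0
    for extra_a, pa in enumerate(dist_a):
        for extra_b, pb in enumerate(dist_b):
            final_a = wins_a + extra_a
            final_b = wins_b + extra_b
            joint = pa * pb  # probability of this pair of outcomes
            if final_a > final_b:
                ahead += joint
            elif final_a == final_b:
                tie += joint
            else:
                behind += joint
    return ahead, tie, behind
```

Each pair of final records is weighted by its joint probability and sorted into one of the three outcomes, which is the law of total probability applied to the race rather than to a single team's record.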

Note: The actual game prediction model and coefficients can be found here.

2 comments:

Tarr said...

Rather than assuming that including something like, say, first downs or TDs will overfit the model, why not test for the predictive power of these statistics?

I sense some disdain for the FO methodology here. It's true they haven't done (or at least they haven't claimed to have done) any rigorous statistical analysis of the factors they consider. But they do claim that all changes to the model are tested by whether they improve the correlation of the statistics from one year to the next, or the correlation of last year's DVOA to next year's wins. They are not testing correlation of this year's stats to this year's wins, which would obviously lead to severe overfit.

Brian Burke said...

I agree for the most part. I'm working on something just as you suggest, but I'm letting the season generate some more data before finalizing it or posting anything.

I'm building a model around series success rates (SSR). It's the percentage of series in which a team gets a 1st down, or prevents one on defense. The average rate is 65% in the NFL. I would think that each team's offensive and defensive SSR is a very simple, handy method of capturing a lot of data about a team. One way or another it captures run and pass efficiencies, turnovers, sacks, penalties, and coaching tactics.

So far, however, it's not as predictive as efficiency stats.

Some of FO's stuff is really good, but some of it leaps to conclusions after a couple of interesting correlations.