NFL Outcome Predictor

How it’s built →

How our model did at predicting the 2025 NFL season. Before each game it estimated each team's chance of winning, using only what was known at the time. We compare those estimates to what actually happened and to the Vegas odds.

 

 

How to read this page

This page shows how our prediction models did across the entire 2025 NFL season. Before each game, every model named a favorite and a win chance (e.g. “Eagles 65%”), using only information known beforehand. We then grade those guesses against what actually happened, and against the Vegas odds, the benchmark that is hardest to beat.

The lines / rows you'll see:

  • Our model (QBElo): this is our prediction. A power rating like chess ratings, adjusted when a backup quarterback starts.
  • Vegas: the betting odds converted to a clean win %. The benchmark we measure ourselves against (almost nobody beats it consistently).
  • Elo and ML: two simpler/alternative versions we built for comparison.
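The page does not spell out how the backup-quarterback adjustment works, so the following is only a minimal sketch of one plausible form. It assumes each quarterback has a value expressed in rating points; the function name `effective_rating`, its parameters, and all numbers are hypothetical.

```python
def effective_rating(team_rating: float,
                     usual_qb_value: float,
                     starting_qb_value: float) -> float:
    """Shift a team's rating by the gap between today's starting QB
    and the QB the rating was earned with (assumed form, not the
    project's actual formula)."""
    return team_rating + (starting_qb_value - usual_qb_value)


# Hypothetical numbers: a 1560-rated team whose usual starter is worth
# 60 rating points has to start a backup worth 15.
adjusted = effective_rating(1560.0, usual_qb_value=60.0, starting_qb_value=15.0)
print(adjusted)  # 1515.0
```

When the usual starter plays, the two values cancel and the team rating is used unchanged.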

The grades:

  • Accuracy: how often it picked the right winner. Higher is better.
  • Brier / Log loss: how good the percentages were (saying “90%” and being wrong is penalized hard). Lower is better.
  • Calibration (the chart below): when a model says “70%”, do those teams really win about 70% of the time? Points on the diagonal line mean the percentages are honest.
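The grades above can be sketched in a few lines. This is a minimal illustration using the standard textbook definitions, not necessarily the project's own code. It assumes `probs` holds the predicted win chance for one side of each game (say, the home team) and `outcomes` holds 1 if that side won, else 0; the sample numbers are made up.

```python
import math


def accuracy(probs, outcomes):
    """Share of games where the side given more than 50% actually won."""
    hits = sum((p > 0.5) == bool(y) for p, y in zip(probs, outcomes))
    return hits / len(probs)


def brier(probs, outcomes):
    """Mean squared gap between the stated probability and the result."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def log_loss(probs, outcomes, eps=1e-15):
    """Mean negative log-likelihood; confident misses cost the most."""
    total = 0.0
    for p, y in zip(probs, outcomes):
        p = min(max(p, eps), 1 - eps)  # keep log() finite at 0% and 100%
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(probs)


def calibration_table(probs, outcomes, n_bins=5):
    """Per probability bin: (mean predicted, actual win rate, games)."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the top bin
        bins[idx].append((p, y))
    rows = []
    for b in bins:
        if b:  # skip empty bins
            mean_pred = sum(p for p, _ in b) / len(b)
            win_rate = sum(y for _, y in b) / len(b)
            rows.append((mean_pred, win_rate, len(b)))
    return rows


# Made-up example: four games, the 65% favorite lost.
probs = [0.9, 0.7, 0.65, 0.3]
outcomes = [1, 1, 0, 0]
print(accuracy(probs, outcomes))  # 0.75
print(brier(probs, outcomes), log_loss(probs, outcomes))
for row in calibration_table(probs, outcomes):
    print(row)
```

A well-calibrated model produces rows where the first two numbers are close, which is the same as sitting on the diagonal of a calibration chart.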

Bottom line: the homemade models get close to Vegas but don’t beat it — the honest, expected result.

How it works

1. The model learns in a loop

Every team has a power rating → Predict each game's win % → Games are played → Grade the guess → Nudge ratings → (repeat)

Beat a strong team and your rating jumps; lose to a weak one and it drops. Repeat every week and the ratings sort the league into a power order — no human opinions needed.
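The loop described above resembles a standard Elo update, which can be sketched as follows. The step size `K = 20` and the 400-point scale are assumptions borrowed from common Elo practice, not values stated on this page, and the team names are placeholders.

```python
K = 20.0       # assumed step size: how far one game can move a rating
SCALE = 400.0  # assumed Elo scale


def expected(r_a: float, r_b: float) -> float:
    """Pre-game win chance for the team rated r_a against r_b."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / SCALE))


def play_game(ratings: dict, winner: str, loser: str) -> float:
    """Predict first, then nudge both ratings by the size of the surprise.

    Returns the win chance the winner had been given before the game.
    """
    p_win = expected(ratings[winner], ratings[loser])
    delta = K * (1.0 - p_win)  # bigger upset means a bigger move
    ratings[winner] += delta
    ratings[loser] -= delta
    return p_win


ratings = {"Strong": 1600.0, "Average": 1500.0, "Weak": 1400.0}

# Beating a stronger team moves the rating more than beating a weaker one.
before = ratings["Average"]
play_game(ratings, "Average", "Strong")
print(f"gain from the upset: {ratings['Average'] - before:.1f}")
```

Because the winner gains exactly what the loser gives up, the league average stays fixed while the ratings spread out over the season.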

2. Turning two ratings into a win chance

The core trick: take the gap between two teams' ratings and bend it into a probability.
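One common way to bend a rating gap into a probability is the logistic curve used by chess Elo. The sketch below assumes that form with the conventional 400-point scale; the page does not state the constants the model actually uses.

```python
def win_probability(rating_gap: float, scale: float = 400.0) -> float:
    """Win chance for the stronger team, given its rating lead.

    Assumes the conventional Elo curve: a lead of `scale` points
    corresponds to 10-to-1 odds.
    """
    return 1.0 / (1.0 + 10 ** (-rating_gap / scale))


for gap in (0, 50, 100, 200, 400):
    print(f"gap {gap:>3}: {win_probability(gap):.1%}")
```

Under these assumptions a gap of 0 gives exactly 50%, and a 400-point gap gives 10/11, or about 91%. The curve flattens at the extremes, so no lead ever reaches 100%.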

3. A real game from 2025

4. Watch a team's power rating over the season

It rises after wins (especially big ones over good teams) and falls after losses. 1500 = average; higher = stronger.

Season totals

Model | Brier ↓ | Log loss ↓ | Accuracy ↑ | Games

Brier and log loss grade the percentages (lower = better); accuracy is just how often the winner was picked. Market = the Vegas odds (the ceiling); baseline = always guess the home team.

Track record through the season

Each line is a model's running grade on its percentages (Brier — lower is better). Vegas (red) stays the best; the homemade models bunch up just above it.
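A running grade like this can be sketched as a cumulative mean of squared errors. The example assumes the games are listed in the order they were played and updates after every game; the chart may well aggregate by week instead.

```python
from itertools import accumulate


def running_brier(probs, outcomes):
    """Brier score over all games so far, after each game in order."""
    squared_errors = [(p - y) ** 2 for p, y in zip(probs, outcomes)]
    return [total / n
            for n, total in enumerate(accumulate(squared_errors), start=1)]


# Made-up example: a perfect call followed by a coin-flip call.
print(running_brier([1.0, 0.5], [1, 1]))  # [0.0, 0.125]
```

Early values swing widely because they rest on few games, so gaps between lines only become meaningful later in the season.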

Weekly breakdown

Click a week to see every game's prediction.

How it was built

The point of this project isn't a secret formula — it's the process: build progressively smarter models and measure each one honestly, never peeking at the future, always benchmarked against Vegas. See the full pipeline & data flow →

The climb — each model, scored on games it had never seen (2019–2024)

Win-pick accuracy (50% = a coin flip). Each step adds sophistication and gets closer to Vegas — none beats it.

The pipeline

Free NFL data (nflverse) → Build features (Elo, QB value, rolling EPA) → Train + back-test (walk-forward, no leakage) → Score vs Vegas → Live tracker (this page)
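The walk-forward step in the pipeline can be illustrated with a season-level split: each test season is scored by a model trained only on earlier seasons. This is a minimal sketch; the `season` field, the function name, and the toy data are assumptions, and the real pipeline may split at a finer grain (for example, game by game).

```python
def walk_forward_splits(games, first_test_season):
    """Yield (season, train, test) where train only holds earlier seasons."""
    seasons = sorted({g["season"] for g in games})
    for season in seasons:
        if season < first_test_season:
            continue  # too early to test: these seasons are training-only
        train = [g for g in games if g["season"] < season]
        test = [g for g in games if g["season"] == season]
        yield season, train, test


# Toy data: two games per season, 2017 through 2020.
games = [{"season": s, "game_id": f"{s}-{i}"}
         for s in range(2017, 2021) for i in range(2)]

for season, train, test in walk_forward_splits(games, first_test_season=2019):
    latest_seen = max(g["season"] for g in train)
    print(f"test {season}: trained on {len(train)} games through {latest_seen}")
```

The training window grows with each step, and no test season ever contributes to the model that predicts it.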

Build decisions that matter

  • No peeking at the future. Every prediction uses only games played before it. Leaking future data is the #1 way projects like this accidentally cheat, and the easiest mistake to make.
  • Tried machine learning, reported the truth. Gradient-boosted and logistic models on EPA / rest features didn't beat the simple QB-aware rating. Shown honestly, not buried.
  • Benchmarked against the closing Vegas line — the sharpest, hardest number to beat.
  • Built from scratch. The rating system, the quarterback-value formula, and the calibration step are all hand-coded — no off-the-shelf prediction library.
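Benchmarking against Vegas requires turning a betting line into a win probability. The sketch below assumes the line is available as American moneyline odds for both teams and removes the bookmaker's margin by simple proportional normalization, which is one of several known methods and not necessarily the one this project uses. The odds shown are made up.

```python
def implied_probability(american_odds: float) -> float:
    """Raw implied probability from American odds (still includes the margin)."""
    if american_odds < 0:
        return -american_odds / (-american_odds + 100.0)
    return 100.0 / (american_odds + 100.0)


def no_vig_probabilities(home_odds: float, away_odds: float):
    """Rescale both sides so the two probabilities sum to exactly 1."""
    home = implied_probability(home_odds)
    away = implied_probability(away_odds)
    total = home + away  # above 1.0 because of the bookmaker's margin
    return home / total, away / total


# Made-up line: home favorite at -150, away underdog at +130.
home_p, away_p = no_vig_probabilities(-150, 130)
print(f"home {home_p:.3f}, away {away_p:.3f}")
```

The raw implied probabilities here add up to more than 100%, which is why the rescaling step is needed before comparing the market to a model.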