Early in a scoreless game, a quarterback throws a 20-yard pass just past the reaching arms of a defender and into the hands of his intended receiver, who holds on despite the distraction, then scampers the remaining 15 yards for a touchdown.
Another quarterback, down 30-10 with five minutes left in the fourth quarter, throws a 3-yard screen pass to a running back, who maneuvers another 32 yards through prevent defense to pick up a first down deep in opponent territory.
Both are called good plays, but labeling them as "good" isn't enough. Each play has a different level of contribution to winning, and each play illustrates a different level of quarterback contribution. What is the quarterback's contribution to winning in each situation? Coaches want to know this; players want to know this; and fans want to know this.
The Total Quarterback Rating is a statistical measure that incorporates the contexts and details of those throws and what they mean for wins. It's built from the team level down to the quarterback, where we understand first what each play means to the team, then give credit to the quarterback for what happened on that play based on what he contributed.
At the team level, identifying what wins games is not revolutionary: scoring points and not allowing points. Back in the 1980s, "The Hidden Game of Football" did some pioneering work on that topic and on how yardage relates to points. We went back and updated what that book did, then we went further. At the individual level, more detailed information about what quarterbacks do is necessary. Brian Burke at AdvancedNFLStats.com has done very good work in advancing that effort, and FootballOutsiders.com has done some of this by charting data, but, for the past three years, ESPN has charted football games in immense detail. By putting all these ideas together and incorporating division of credit, we have built a metric of quarterback value, the Total Quarterback Rating, Total QBR or QBR for short.
What follows is a summary of what goes into QBR. It took several thousand lines of code to implement, but we'll keep this shorter.
Win Probability and Expected Points
The goal behind any player rating should be determining how much a player contributes to a win. We went back through 10 years of NFL play-by-play data to look at game situation (down, distance, yard line, clock time, timeouts, home field, field surface and score), along with the ultimate outcome of the game, to develop a win probability function.
This function treats every win the same, regardless of whether it was 45-3 or 24-23, though there is clearly a difference between such games. The first game represents total domination, whereas the other represents two fairly evenly matched teams. Because win probability treats every win the same, it misses some of what goes into the win, specifically many of the points that represent domination or the points that lead up to a last-second victory. So, although QBR uses win probability to assess how "clutch" a situation is, it uses expected points as the basis of evaluating quarterbacks. Expected points capture more of the details and distinguish between kinds of wins, yet they still relate strongly to winning in general.
The concept of expected points was discussed as early as the mid-1980s with Pete Palmer & Co. and "The Hidden Game of Football," in which they talk about "point potential." Their idea was that, as you move closer to the opponents' end zone, you are actually gaining points. Brian Burke took it further to note that third-and-10 from midfield, for instance, has fewer expected points than first-and-10 from midfield. In other words, down and distance also matter in terms of points. We took this even further to look at clock time, home field, timeouts and field surface to generate the expected points for any team given its situation in a drive. One particular situation to note is that, at the end of the half, a team is less likely to score any points than at most other times of the game, just because the half is going to expire.
It's useful to mention here that expected points are expected net points. It's possible that a team has expected points less than 0. This simply implies that the other team is generally more likely to score. This usually happens when a team is backed up deep in its own side of the field, especially if it is third or fourth down.
What then happens is an evaluation of expected points added. How does a team go from 1.1 expected points to 2.1? However it does it, that is 1.0 expected points to be distributed to the offensive players on the field. But how the team does it is what determines how credit is given to a quarterback.
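The arithmetic behind expected points added can be sketched in a few lines. This is only an illustration of the subtraction described above, not the production model: the function name is ours, and the before-and-after values are taken from the example in the text.

```python
def expected_points_added(ep_before: float, ep_after: float) -> float:
    """Change in a team's expected net points produced by a single play."""
    return ep_after - ep_before


# The example from the text: a team goes from 1.1 expected points to 2.1.
epa = expected_points_added(1.1, 2.1)
print(round(epa, 2))  # 1.0 expected points to distribute among the offense
```

A play that leaves the team worse off than before simply produces a negative value, which is then divided among the players in the same way.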
Division of credit is the next step. Dividing credit among teammates is one of the most difficult but important aspects of sports. Teammates rely upon each other and, as the cliché goes, a team might not be the sum of its parts. By dividing credit, we are forcing the parts to sum up to the team, understanding the limitations but knowing that it is the best way statistically for the rating.
On a pass play, for instance, there are a few basic components:
• The pass protection
• The throw
• The catch
• The run after the catch
In the first segment, the blockers and the quarterback have responsibility for keeping the play alive, and the receivers have to get open for a QB to avoid a sack or having to throw the ball away. On the throw itself, a quarterback has to throw an accurate ball to the intended receiver. Certain receivers might run better or worse routes, so the ability of a QB to be on target also relates somewhat to the receivers. For the catch, it might be a very easy one where the QB laid it in right in stride and no defenders were there to distract the receiver. Or it could be that the QB threaded a needle and defenders absolutely hammered the receiver as he caught the ball, making it difficult to hold on. So even the catch is about both the receiver and the QB. Finally, the run after the catch depends on whether a QB hit the receiver in stride beyond the defense and on the ability of a receiver to be elusive. Whatever credit we give to the blockers, receivers and quarterback in these situations is designed to sum to the team expected points added.
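To make the idea of credit that sums to the team total concrete, here is a minimal sketch. Every number in it is hypothetical: the article does not publish the actual shares, so the table of shares, the role names and the per-component values below are assumptions chosen only to show the bookkeeping.

```python
# Hypothetical shares: the fraction of each pass-play component credited to
# each role. Illustrative values only; each row sums to 1.0 so no credit is
# lost or created.
COMPONENT_SHARES = {
    "protection":      {"qb": 0.4, "blockers": 0.5, "receivers": 0.1},
    "throw":           {"qb": 0.8, "blockers": 0.0, "receivers": 0.2},
    "catch":           {"qb": 0.4, "blockers": 0.0, "receivers": 0.6},
    "run_after_catch": {"qb": 0.2, "blockers": 0.0, "receivers": 0.8},
}


def divide_credit(component_epa):
    """Split each component's expected points added among the roles."""
    credit = {"qb": 0.0, "blockers": 0.0, "receivers": 0.0}
    for component, epa in component_epa.items():
        for role, share in COMPONENT_SHARES[component].items():
            credit[role] += share * epa
    return credit


# Hypothetical play worth 1.0 expected points added in total.
play = {"protection": 0.2, "throw": 0.3, "catch": 0.1, "run_after_catch": 0.4}
credit = divide_credit(play)
print(credit)
print(sum(credit.values()))  # matches the team total, up to rounding
```

Because each row of shares sums to 1.0, the credit handed to the quarterback, blockers and receivers always adds back up to the team's expected points added on the play.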
The ESPN video tracking has been useful in helping to separate credit in plays like these. We track overthrows, underthrows, dropped passes, defended passes and yards after the catch. The big part was taking this information and analyzing how much of it was related to the QB, the receivers and the blockers. Not surprisingly, pass protection is related mostly to the QB and the offensive line, but yards after the catch is more about what the receiver does. Statistical analysis was able to show this, and we divided credit based on those things.
As a relevant side note, statistical analysis showed that what we call a dropped pass was not all a receiver's fault, either. A receiver might drop a ball because he wanted to run before catching it, because the defense distracted him, because it was a little bit behind him or because he was about to get hit by a defender. If the defender was there a half second before, the defender would have knocked the ball free and it would have been called a "defended pass," not a "dropped pass." There are shades of gray even on a dropped pass, and analysis showed that. Drops are less a QB's fault than defended passes or underthrows, but the QB does share some blame.
On most other plays, quarterbacks receive some portion of credit for the result of the play, including defensive pass interference, intentional grounding, scrambles, sacks, fumbles, fumble recoveries (Carson Palmer once saved a game for the Bengals by recovering a teammate's fumble) and throwaways.
On plays when the QB just hands off to a running back, we didn't assign any credit to the QB. Our NFL experts did suggest that some QBs are very good at interpreting defenses pre-snap and identifying better holes for their backs. However, they also told us it would be nearly impossible to incorporate. Because they suggested this, we built in the ability to give credit to QBs when they just hand off, but we couldn't find the right analysis to do it in 2011.
The final major step is to look at how "clutch" the situation was when creating expected points. A normal play has a clutch index of 1.0. For instance, first-and-goal from the 10-yard line in a tie game at the start of the second quarter has a clutch index of almost exactly 1.0. A more clutch situation, one late in the game when the game is close -- the same situation as above but midway through the fourth quarter, for example -- has a clutch index of about 2.0. Maximum clutch indices are about 3.0, and minimum indices are about 0.3.
These clutch index values came from an analysis of how different situations affect a game's win probability on average. One way to think of it is in terms of pressure. A clutch play is defined before the play by how close the game appears to be. Down four points with three seconds to go and facing third-and-goal from the 3-yard line -- that is a high-pressure and high-clutch index situation because the play can realistically raise the odds of winning to almost 100 percent or bring them down from about 40 percent to almost zero percent. The same situation from midfield isn't as high pressure because it's very unlikely that the team will pull out the victory. Sure, a Hail Mary can pull the game out, but if it doesn't work, the team didn't fail on that play so much as it failed before then. On third-and-goal from the 3-yard line, failure means people will be talking about that final play and what went wrong.
On each play when the QB had a significant contribution, the quarterback's expected points added are multiplied by that play's clutch index. Those products are summed, then divided by the sum of the clutch indices and multiplied by 100 to get a clutch-valued expected points added per 100 plays.
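The calculation just described is a clutch-weighted average scaled to 100 plays. The sketch below follows that description; the function name and the sample plays are hypothetical, and each play is assumed to arrive as a pair of the quarterback's expected points added and the play's clutch index.

```python
def clutch_weighted_epa_per_100(plays):
    """Clutch-valued expected points added per 100 plays.

    `plays` is an iterable of (qb_epa, clutch_index) pairs, one per play
    on which the quarterback had a significant contribution.
    """
    plays = list(plays)
    total_clutch = sum(clutch for _, clutch in plays)
    if total_clutch <= 0:
        raise ValueError("need at least one play with a positive clutch index")
    weighted_epa = sum(epa * clutch for epa, clutch in plays)
    return 100.0 * weighted_epa / total_clutch


# Three hypothetical plays: a normal one, a high-clutch one, a low-clutch one.
sample_plays = [(0.5, 1.0), (-0.3, 2.0), (1.2, 0.3)]
rate = clutch_weighted_epa_per_100(sample_plays)
print(round(rate, 2))
```

Dividing by the sum of the clutch indices rather than by the number of plays means a quarterback is not rewarded simply for playing in more high-clutch situations, only for what he does in them.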
A Rating from 0 to 100
The final step is transforming the clutch-valued expected points rate to a number from 0 to 100. This is just a mathematical formula with no significance other than to make it easier to communicate. A value of 90 and above sounds good whether you're talking about a season, a game or just third-and-long situations; a value of four or 14 doesn't sound very good; a value of 50 is average, and that is what QBR generates for an average performance.
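The actual formula is not given here, so the following is purely an illustration of what such a transformation could look like, not the one QBR uses. It assumes a logistic curve, assumes that an average performance corresponds to a rate of zero, and uses a made-up `scale` parameter to control how quickly ratings approach 0 or 100.

```python
import math


def to_rating(rate_per_100: float, scale: float = 5.0) -> float:
    """Map a clutch-valued rate onto a 0-to-100 scale with a logistic curve.

    Assumptions (not from the article): the curve is logistic, a rate of 0
    is treated as average, and `scale` is an arbitrary illustrative value.
    """
    return 100.0 / (1.0 + math.exp(-rate_per_100 / scale))


print(to_rating(0.0))  # 50.0, the rating assigned to an average performance
print(to_rating(10.0) > to_rating(0.0) > to_rating(-10.0))  # True
```

Any smooth, increasing function bounded by 0 and 100 would serve the same communication purpose; the choice of curve changes the spacing of the ratings but not the ordering of the quarterbacks.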
That being said, the top values in a season tend to be about 75 and above, whereas the top values in a game are in the upper 90s. Aaron Rodgers might have gone 31-of-36 for 366 yards, with three passing TDs, another TD running, 19 first-down conversions, and eight conversions on third or fourth down in one game -- for a single-game Total QBR of 97.2 -- but he can't keep that up all year long. Pro Bowl-level performance for a season usually means a QBR of at least 65 or 70. We don't expect to see a season with a QBR in the 90s.
With this rating, we have intentionally not adjusted for opponents. This doesn't mean that we won't adjust for opponents as we use it but that we want QBR to be flexible for many purposes, and keeping opponents' strength out gives us that flexibility. As it stands, QBR can be broken down for all sorts of situations -- red zone, third-and-long, throwing to a certain receiver, in bad weather, against different defensive formations. We didn't want to muddy it up with opponent adjustments that aren't as useful for those situations. How to implement a defensive adjustment for third-and-long also might be different from one for the whole season. Beyond this, a defensive adjustment is often not a constant factor. A defense that looks good in Week 4 might not be as good after a few more weeks. Because it isn't a constant thing, it makes sense to leave that for analysis rather than constant incorporation into QBR.
There will be analyses that we do on ESPN that will suggest the use of an opponent adjustment, but we will do that when needed, not up front.
What underlies QBR is an understanding of how football works and a lot of detailed situational data. What it yields are results that should reflect that. It illustrates that converting on third-and-long is important to a quarterback. It shows that a pass that is in the air for 40 yards is more reflective of a quarterback than a pass that is in the air for 5 yards before the receiver adds 35 yards of run after the catch. These premises should sound reasonable to football fans. They come out of a lot of statistical analysis, but they are also consistent with what coaches and players understand.
As we neared the end of the development of QBR, we talked to Ron Jaworski and Greg Cosell at NFL Films about its evolution. Cosell said at one point, "Football is not complex, but it is very detailed." I realized then that QBR is like that. It is very detailed, accounting for a lot of different situations, but it is not particularly complex. It really does try to see the game the way we have gotten used to seeing it in its elegant simplicity. We hope you, the fan, appreciate it, as well.
Dean Oliver is one of the pioneers in sports analytics. Author of "Basketball on Paper," the standard for doing analytics in basketball, Oliver applied his work to personnel and coaching matters for five successful years in the front office of the Denver Nuggets. Oliver is the director of analytics for ESPN. He joined the company in 2011 to build the analytics group, which works across a number of sports.