Book excerpt: Wertheim's Scorecasting

Updated: January 24, 2011, 11:36 AM ET
By L. Jon Wertheim | Special to

In their "Freakonomics for sports" book, Scorecasting, Tobias J. Moskowitz and L. Jon Wertheim challenge conventional wisdom, uncover the hidden influences in sports and use reams of data to investigate questions that tug at every fan. Are there really make-up calls in the NBA? Is there, in fact, a home field advantage? Is there really no 'I' in team?

In the following excerpt, the authors -- one a University of Chicago economist, the other a writer at Sports Illustrated -- consider why "four of his last five" means "four of his last six."

Damned Statistics

At some point it became almost cartoonish. As though he wasn't shooting the basketball so much as he was simply redirecting his teammates' passes into the hoop. In the first half of the second game of the 2010 NBA Finals, Ray Allen, the Boston Celtics' veteran guard, was … well, the usual clichés -- "on fire," "unconscious," "in the zone" -- didn't do it justice. Shooting with ruthless accuracy, Allen drained seven three-pointers, most of them bypassing the rim and simply finding the bottom of the net. Swish. Swish. Swish-swish-swish. In all, he scored 27 points in the first half. Celtics reserve Nate Robinson giddily declared Allen, "The best shooter in the history of the NBA."

As Allen fired away, the commentators unleashed a similarly furious barrage of stats, confirmed by the graphics on the screen. The shooting was cast in the most glowing terms possible. Allen, viewers were told at one point, had made his last seven shots. When he missed a two-pointer (turns out he was only 3 for 9 on two-point attempts), the stats suddenly focused only on the three-pointers.

It was, of course, inevitable that Allen would cool off. And he did in the second half, making only one three-pointer, although his eight treys for the game became a new NBA Finals record and his 32 total points enabled Boston to beat the L.A. Lakers 103-94. But he really cooled off in his next game. This time he was ruthless in his inaccuracy, missing all 13 of his shots, including eight three-point attempts, as Boston lost 91-84. As Allen clanged shot after shot, the commentators were quick to note this whiplash-inducing reversal of fortune, framing it in the most damning possible terms. At one point viewers were told that, between the two games, Allen had missed 18 straight attempts.

Inasmuch as sports fans are tricked by randomness, the media share in the blame. Statistics and data are the forensic evidence of sports. But like all pieces of evidence, they can be mishandled and tampered with. We are bombarded by stats when we watch games. But the data is chosen selectively, often focused on small samples and short-term numbers. When we're told that a player has reached base in "four of his last five at-bats," we should assume right away that it's four of his last six. Otherwise, surely we'd have been told that the streak was five out of six. Clearly a team that "has lost three in a row" has dropped only three of its last four -- and possibly three of five or three of six or … Otherwise, it would have been reported as a four-game losing streak.
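The "four of his last five" inference can be checked mechanically. What follows is a minimal sketch, assuming a broadcaster who always quotes the trailing window with the highest success rate and, on a tie, the longer one. The function name most_flattering_window, the minimum window of three and the sample sequence are all hypothetical illustrations, not data from the book.

```python
from fractions import Fraction

def most_flattering_window(outcomes, min_len=3):
    """Pick the trailing window a selective broadcaster would quote.

    `outcomes` is ordered oldest to newest; True marks a success.
    Assumption: the quoted window maximizes the success rate, and
    a tie in rate goes to the longer window.
    Returns (successes, window_length).
    """
    if len(outcomes) < min_len:
        raise ValueError("not enough outcomes for the shortest window")
    windows = []
    for n in range(min_len, len(outcomes) + 1):
        hits = sum(outcomes[-n:])  # successes in the last n tries
        windows.append((Fraction(hits, n), n, hits))
    # Tuples compare by rate first, then by window length.
    _, n, hits = max(windows)
    return hits, n

# Hypothetical plate appearances, oldest first.
recent = [True, False, True, True, False, True, True]
hits, n = most_flattering_window(recent)
print(f"{hits} of his last {n}")

# The try just before the quoted window, when one exists.
previous_try = recent[-(n + 1)] if n < len(recent) else None
print("try before the window was a success:", previous_try)
```

Under that selection rule the try just before the quoted window can never be a success: adding a success to the window would raise the rate, or tie it with a longer window, and so would have been quoted instead.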

The sports media have an interest in selling the most extreme scenario. Collectively, they pick and choose data accordingly. Take, for instance, a September 15, 2009 game between the New York Yankees and the Toronto Blue Jays, a showdown between Alex Rodriguez and Roy Halladay, arguably the league's best hitter and best pitcher at the time. The Yankees broadcasters might have framed the encounter along the following lines, using the most positive statistics at their disposal:

Rodriguez steps to the plate. He's hitting .357 against Halladay this season, including five hits in his last 12 at-bats against the big righty, a .412 clip. Over his last eleven games, A-Rod is hitting .436. Remember that, as trade rumors swirl, Halladay has lost four of his last five starts and 11 of his last 15.

Given this information, it sounds almost like a foregone conclusion that A-Rod is going to crush the ball. One almost feels pity for Halladay.

Now listen to how the Toronto broadcasters might have addressed the showdown using the best available statistics to make their case:

Halladay comes in having pitched two straight complete games. Over those 18 innings, he struck out 18 men and gave up only four earned runs, a 2.00 ERA. Meanwhile, A-Rod is hitless in his last six at-bats against Halladay. Among all opposing teams, Rodriguez has the lowest average -- and strikes out the most -- against the Blue Jays.

After hearing this, we'd be surprised if Rodriguez made contact off Halladay, much less reached base.

Both renderings would have been perfectly accurate. Both sets of statistics are true. Yet they paint radically different pictures. Incidentally, in that Yankees-Blue Jays game, Halladay pitched six innings, allowed two earned runs and got the win; Rodriguez was one-for-three with a double against Halladay -- pretty much what a neutral observer, ignoring the noise and looking at as much data as possible, would have predicted.

Teams are complicit in this selectivity, too. Check the scoreboard next time you're at a baseball game, for instance. Had you attended a White Sox-Tigers game at U.S. Cellular Field in the summer of 2010, you could have learned that Chicago's outfielder Carlos Quentin was "hitting .371 over his last nine games." While this was impressive and meant to convey a hot streak, it told us … what exactly? Not much, not with a sample size that small. If the White Sox were attempting to predict the outcome of Quentin's next at-bat, they would have provided a more meaningful statistic, using a larger data set. But noting that Quentin was "351 for 1,420 (.247) for his career" doesn't quite stir passion.
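A rough calculation shows how little a nine-game sample pins down. This is a minimal sketch that treats every at-bat as an independent trial with a fixed chance of a hit, a simplifying binomial assumption that the book does not make. The scoreboard gave no at-bat count for the nine games, so the figure of 35 below is hypothetical; the career line of 351 for 1,420 comes from the text.

```python
import math

def standard_error(p, n):
    """Standard error of a proportion over n independent trials."""
    return math.sqrt(p * (1 - p) / n)

# Career line quoted in the text.
career_hits, career_at_bats = 351, 1420
career_avg = career_hits / career_at_bats

# Assumption: the nine-game at-bat count is not given in the text.
HYPOTHETICAL_STREAK_AT_BATS = 35
streak_avg = 0.371  # scoreboard figure from the text

se_streak = standard_error(career_avg, HYPOTHETICAL_STREAK_AT_BATS)
se_career = standard_error(career_avg, career_at_bats)

# How many standard errors the hot streak sits above the career mark.
z = (streak_avg - career_avg) / se_streak

print(f"career average: {career_avg:.3f}")
print(f"standard error over {HYPOTHETICAL_STREAK_AT_BATS} at-bats: {se_streak:.3f}")
print(f"standard error over {career_at_bats} at-bats: {se_career:.3f}")
print(f"streak vs. career, in standard errors: {z:.2f}")
```

Under these assumptions the nine-game figure carries an uncertainty of roughly 70 points of batting average, against roughly 11 points for the career figure, so a .371 stretch lies within two standard errors of what a .247 hitter could produce by chance.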

When Nate Robinson declared Ray Allen the best shooter in the annals of the NBA, he may have been right. But not because Allen had one torrid shooting game. Otherwise, you could just as easily make the case that, based on the following night's game, Allen was also the worst shooter in NBA history. Robinson's more convincing evidence: for his NBA career Allen has taken more than 6,000 three-point attempts and made roughly 40 percent of them.

Those two games of extremes during the 2010 NBA Finals? Unsexy as it might have been to use the largest available data set and note Allen's career average, it would have helped the viewers. Between the two games, he was 8-of-19 from three-point range, or 42 percent, conforming almost exactly with his career mark.
