FORTY-NINE YEARS AGO, Sports Illustrated profiled an engineering professor with jolting theories about baseball. Earnshaw Cook, author of the then-unreleased Percentage Baseball, claimed that advanced math and objective analysis could turn any team into a pennant winner. He didn't call it sabermetrics and he didn't call it Moneyball, but that's essentially what Cook was proposing.
Also 49 years ago, a man in Wilmington, Del., named Herb Groh wrote a letter to the magazine in response. "Thank goodness the game of baseball can never be reduced to adding-machine accuracy," he wrote. "It's much more fun this way."
Groh's letter speaks for many, especially today. Cook's book, meanwhile, speaks for almost nobody: Many of his findings turned out to be wrong and would have actually cost teams wins. But science rarely gets it right the first time; it gets it right over time. Wins Above Replacement -- an all-encompassing measure of a player's value developed through decades of data and debate by baseball's army of amateur analysts -- gets it right.
In 2012 WAR took on a co-headlining role in the American League's MVP race. For Mike Trout supporters, WAR was simple and unimpeachable evidence of a perfect player performing at a nearly unprecedented level. For Miguel Cabrera supporters, WAR was the joyless and inscrutable tool of eggheads, trolls, all of us who never played the game. Cabrera vs. Trout was often reduced to a referendum on the value of data. "This WAR statistic is another way of declaring, 'Nerds win!'" best-selling author Mitch Albom wrote in defense of Cabrera.
Albom has it wrong. At the risk of grandiloquence, this is about more than one MVP race, about more than even baseball. We live in a world of disagreement on epochal issues that we can't resolve even when the science is unambiguous: evolution, vaccines and climate change among them. These issues are daunting. Relying on science that's hard to understand can be scary. So the tendency is to cling to the comforts of ideology and tradition -- even when those ideologies are wrong, even when the traditions are outdated.
Fight it if you like, but baseball has become too complicated to solve without science. Every rotation of every pitch is measured now. Every inch that a baseball travels is measured now. Teams that used to get mocked for using spreadsheets now rely on databases packed with precise location and movement of every player on every play -- and those teams are the norm, not the film-inspiring exceptions. This is exciting and it's terrifying.
Even though I'm a staff writer and editor at Baseball Prospectus, I'm not going to try to convince you that Mike Trout should have beaten Miguel Cabrera for the MVP award. WAR, despite what you might have read, does not take a position on that. But I will try to convince you that WAR represents a chance to respond to the complexity of baseball with something more than ideology or despair.
And fear not, Herb Groh of Wilmington, Del. WAR also makes baseball much more fun.
1. WAR TELLS A GREAT STORY.
A single baseball at-bat is magnificently complex, a single game exponentially more so, a team's season more than that and a player's career more than that. WAR uses the most advanced available data to measure, in each area of performance, how many runs a major league player saved or produced relative to a consistent baseline: the runs likely saved or produced by an average minor leaguer called up as a hypothetical "replacement." It expresses those runs as wins -- about 10 runs to one win -- and calls it a player's "worth" over a year. It takes all of baseball's amazing intricacy and sums it up in a number. Not even a colorful number. Usually a boring number, like 1.3. There's nothing interesting about 1.3.
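For readers who want to see the arithmetic, here is a minimal Python sketch of that last step: adding up runs across areas of performance and converting them to wins. It is a sketch under stated assumptions, not any site's actual formula. The function name, the component labels and the run values are all hypothetical, and the only figure taken from the text is the rough rate of about 10 runs to one win.

```python
# Illustrative only: the component values below are made up, not real player data.
RUNS_PER_WIN = 10.0  # the article's rough conversion: about 10 runs to one win


def wins_above_replacement(runs_above_replacement, runs_per_win=RUNS_PER_WIN):
    """Sum run values across areas of performance and convert them to wins."""
    total_runs = sum(runs_above_replacement.values())
    return total_runs / runs_per_win


# Hypothetical player: runs saved or produced relative to a replacement player.
example_player = {"batting": 8.0, "baserunning": 1.0, "fielding": 4.0}
example_war = wins_above_replacement(example_player)
print(f"{example_war:.1f}")
```

Thirteen runs spread over three areas of the game collapse into one unremarkable figure, which is the whole point.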
Reducing a player's worth to a single number can be contemptible, says John Thorn, a seminal sabermetric writer and the author of the 1984 book The Hidden Game of Baseball. That book introduced the Linear Weights System, which attaches a value in runs to every offensive event. (For instance, when the book was released, a single was worth 0.47 of a run.) The Linear Weights System provides the mathematical basis for WAR's offensive components. Thorn, while supportive of WAR, criticizes the way it is often deployed to end an argument.
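The idea of attaching a run value to every offensive event can be sketched in a few lines of Python. Only the 0.47 for a single comes from the text above. Every other weight here is a placeholder chosen for illustration, not a value quoted from the book, and the season line is hypothetical.

```python
# Only the 0.47 for a single comes from the text. Every other weight is a
# placeholder chosen for illustration, not a value quoted from the book.
RUN_VALUES = {
    "single": 0.47,
    "double": 0.78,
    "triple": 1.09,
    "home_run": 1.40,
    "walk": 0.33,
    "out": -0.25,
}


def linear_weights_runs(event_counts, run_values=RUN_VALUES):
    """Multiply each event count by its run value and add up the results."""
    return sum(run_values[event] * count for event, count in event_counts.items())


# Hypothetical season line, not a real player's.
season = {"single": 100, "double": 30, "triple": 5, "home_run": 20, "walk": 60, "out": 400}
season_runs = linear_weights_runs(season)
print(round(season_runs, 2))
```

The total is a count of runs relative to the baseline implied by the weights, so a hitter who makes a lot of outs can come out negative.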
"The current lowest common denominator of statistical writing is the fixation on comparing Player A with Player B, which seems to me not only worthless but serves to obscure the larger story of baseball," Thorn says. "Enjoyment of baseball is like enjoyment of art. If you decide it has to have a utilitarian function … you make it seem like work. It's supposed to be play."
But Thorn's complaint is about the user, not about the tool. Loving baseball and loving WAR go together perfectly.
At three hours per game, 162 games a year, 10 or 20 or 80 years, many of us spend tens of thousands of hours on the sport -- and that's not even getting into the fantasy-team managing and card collecting and this-article reading and all the other stuff that a baseball fan does instead of doing his job and mowing the lawn. Few of us will ever make a dime in the game, so why do we watch? For the same reason we love sitcoms and movies: We grow through the story -- struggle, redemption, connection.
WAR tells a new story about baseball. Better, WAR shows that new story, because it embeds every part of the game within its formula. Consider shortstop David Eckstein. The mainstream story about Eckstein -- he's small and not technically very good, but boy does he have grit -- was told through adjectives, not facts. At the media-criticism site Fire Joe Morgan, there was a David Eckstein category comprising 20 separate posts on Eckstein hagiographies. That's nearly 12,000 (hysterical) words mocking the reporters who celebrated the plucky Eckstein despite his weak arm, punchless bat and general failure to be athletic.
Now, here's the twist: David Eckstein was actually very valuable, and it had nothing to do with the adjectives. In 2002 Eckstein (WAR of 4.4, according to analytics-based website FanGraphs) was almost as good as Miguel Tejada (WAR of 4.7), who won the AL MVP award that year. Tejada hit 34 home runs and drove in 131. But Eckstein was nearly his equal while driving in 63 and taking a running start every time he threw to first. How? WAR, and the components that it comprises, tells us:
1. Eckstein let himself get hit by 27 pitches, giving him a better OBP than Tejada and blunting Tejada's power advantage.
2. Eckstein hit into a third as many double plays.
3. Eckstein was actually a good defensive shortstop with more range than Tejada and more success turning double plays.
A writer who wanted to praise Eckstein, then, could have made some assumptions about Eckstein based on his height, weight and skin color (white), collected some flattering athlete-cliche quotes from Eckstein's teammates and flipped through his thesaurus looking for new words -- thaumaturgical! leptosome! -- to describe the little guy. Or he could have started with WAR and explained how David Eckstein, ballplayer, was good at playing ball.
During the AL MVP debate last season, WAR didn't just hang a number on each player -- it once again revealed a story. Trout was the best hitter in the American League (54 runs better than the average player, according to the website Baseball-Reference, compared with Cabrera's 52) and the best baserunner in the American League (10 runs better than the average player, compared with Cabrera's 0) and the third-best fielder in the American League (21 runs saved to Cabrera's minus-4). No non–Hall of Fame hitter in history besides Barry Bonds has produced more WAR in a season than Trout's 10.7. No player in history has been a plus-10 runner and a plus-20 fielder and a plus-50 hitter. Arguably no player's greatness has ever been as well-rounded as Trout's was in 2012. That's a pretty awesome story, and it is one we'll tell our grandchildren we saw.
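The figures quoted above can be added up directly. This is a rough sketch, not Baseball-Reference's calculation: it uses only the three runs-above-average components cited in the text, ignores positional adjustments, playing time and the replacement-level baseline, and applies the approximate rate of 10 runs to one win.

```python
# Runs above average in 2012, as quoted in the text from Baseball-Reference.
trout = {"batting": 54, "baserunning": 10, "fielding": 21}
cabrera = {"batting": 52, "baserunning": 0, "fielding": -4}

RUNS_PER_WIN = 10.0  # rough conversion; an approximation, not an exact rate

trout_runs = sum(trout.values())
cabrera_runs = sum(cabrera.values())
gap_in_wins = (trout_runs - cabrera_runs) / RUNS_PER_WIN
print(trout_runs, cabrera_runs, gap_in_wins)
```

On these three components alone, the gap is 37 runs. Almost all of it comes from the bases and the glove, not the bat.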
2. WAR DEMANDS FAITH.
The business of measuring baseball players -- by writers, agents and teams -- gets more exhaustive every year. WAR, by attempting to factor in every measurable part of the game, not only reflects that flood of data but also carries within it every step of the stat's evolution.
Like most advancements in baseball research, WAR in all probability started with Bill James, who in the mid-1980s gradually began comparing players to a low, consistent baseline in his annual Bill James Baseball Abstract. A decade later, sabermetrician Keith Woolner developed Value Over Replacement Player, which was adopted by Baseball Prospectus and measured offensive prowess compared with others at the same position. Over the next 15 years, that site and then FanGraphs and Baseball-Reference introduced their own versions, gradually incorporating defense and baserunning, hitters' tendencies to hit into double plays, and adjustments for league difficulty and offensive environments, while also tweaking their definition of the replacement-level player. As the math was continually refined by peer-reviewed research, sabermetric books and websites, so emerged one of the most challenging criticisms of WAR: that it is inscrutable to fans who are not high-level mathematicians.
Sean Smith was a high school pitcher in Louisville in the 1980s when he read James' writing. He ultimately started developing his own formula, and his work became Baseball-Reference's original algorithm for WAR. It has since been modified so that even he can't calculate WAR the way you or I could calculate batting average, from scratch. "You can't calculate UZR," Smith says, referring to the defensive metric FanGraphs uses for its WAR. "They're getting data that's not available publicly. Baseball-Reference is using Defensive Runs Saved, which comes from Baseball Info Solutions, which is not public; the detailed play-by-play isn't available." And the parts of WAR that are available -- each site's metrics on hitting performance, pitching performance, park factors, strength of opponent and league baselines for baserunning and double plays -- are too advanced and/or unwieldy for most fans to realistically figure out on their own.
I'm not a mathematician and I'm not a scientist. I'm a guy who tries to understand baseball with common sense. In this era, that means embracing advanced metrics that I don't really understand. That should make me a little uncomfortable, and it does. WAR is a crisscrossed mess of routes leading toward something that, basically, I have to take on faith.
And faith is irrational and anti-intellectual, right? Faith is for rain dances and sun gods, for spirituality but not science. Actually, no. Faith is how we organize a complicated modern world. Faith is what you have when your doctor walks in with a syringe filled with something that could be anything and tells you that it'll keep you from getting the measles. Unless you're a doctor or a medical scientist, you don't really understand vaccines, and you certainly can't brew one up at home. You have outsourced the intellectual side of your health to people who, your faith reassures you, are smarter than you. Maybe in one way of looking at it you're not as smart as your great-great-great-grandparents were, because they had to take responsibility for cooking their own medicine. But you'll live longer. The complicated nature of WAR, your inability to touch the guts of it, isn't an argument against it. That's just what human advancement looks like in the 21st century. And if you can accept that you can walk into a tube built out of 100 tons of aluminum, fly seven miles off the ground and land safely thousands of miles away, you can accept WAR.
3. WAR IS HONEST.
When the Braves signed B.J. Upton this offseason, a Braves fan who went to FanGraphs would have been heartened: With 13.9 WAR over the past four years, Upton has played like an All-Star, one of baseball's 50 best players. A Braves fan who went to Baseball-Reference would have been heartbroken, seeing that Upton, with just 7.2 WAR over those four years, is actually worse than the average player. So Upton is either an All-Star, or he's below average. FanGraphs is wrong about Upton, or Baseball-Reference is wrong, or they both are.
No such disagreement divided the three sites in the AL MVP race -- all supported Trout by about three wins over Cabrera -- but the point hung over the debate: If the numbers vary from site to site, how can they be reliable? How can a model that gets it wrong be right?
Examples like Upton's, in which WAR tells dramatically different stories about the same player, are Exhibit A in the skeptic's case against WAR. They're not the norm, but because defense and pitcher's luck can be so hard to capture, they're also not rare. Zack Greinke's WAR over the past three seasons is nearly three times as high on FanGraphs as on Baseball-Reference because FanGraphs focuses more on strikeouts and walks and less on his so-so ERAs. Martin Prado (traded this winter for Upton's brother, Justin) was an MVP candidate on FanGraphs (5.9 WAR in 2012) and merely average (2.3) on Baseball Prospectus, mostly because the sites disagreed on his glove.
It's fair to consider WAR less definitive and harder to use in a national telecast or on the back of a baseball card because there are always the questions of "which WAR" and "why?" Among Baseball-Reference, FanGraphs and Baseball Prospectus, the variations employed are both small -- Baseball-Reference alone counts missed cutoff men -- and very big. The systems use three very different defensive metrics. Each uses a different replacement-level baseline. FanGraphs' WAR considers only a pitcher's strikeouts, walks and home runs; the number of runs he gives up doesn't matter. That's a huge philosophical choice, and there are reasons to agree with it (in the short term, those stats are all we can say a pitcher controls) and reasons to disagree with it (like: It feels weird). In a chart published by Baseball-Reference comparing the three systems, there are 43 different sections of dispute. Wrote Sean Forman, the publisher of Baseball-Reference: "There are hundreds of steps to make this calculation and dozens of places where reasonable people can disagree."
But the variations make WAR useful and honest in a way no static statistic can be. Let's start with batting average. FanGraphs lists B.J. Upton's batting average last season as .246. So do Baseball-Reference and Baseball Prospectus. But for all that certainty, batting average answers only one small question and then stops. It doesn't tell us whether Upton's ballpark helped or hurt him, whether he augmented that offensive production with walks and extra-base hits, whether he added overall value with his glove or his legs. WAR does all that. All batting average can tell us is that Upton got hits more frequently than Jose Bautista and less frequently than Ronny Paulino.
You can't stop there unless you're prepared to say Paulino is better than Upton and both are better than Bautista. So even if you favor traditional stats, you add home runs to the equation to get a sense for a player's power, and maybe you add RBIs, walks, stolen bases and fielding percentage. Now you've got more than a stat. You've got a recipe. And because everyone uses recipes when evaluating players, you're on board with the concept of WAR, which just uses more complex ingredients.
I trust the recipes of FanGraphs, Baseball-Reference and Baseball Prospectus because these sites incorporate decades of research, the scope of which I could never match on my own. Their recipes are the result of smart people pushing one another to get smarter, and they deviate from one another because the sport, like the world, is messy. But the bottom line is that WAR works. In 2012 the correlation between Baseball Prospectus' WAR and team victories was 0.86 (where 1.0 would have meant a perfect correlation). The correlation between batting average and victories was 0.27. Teams with more WAR win more games. Teams with better batting averages don't.
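For anyone curious how a correlation figure like that is computed, here is a minimal sketch, assuming the standard Pearson correlation is the measure being reported (the text does not say). The team totals are hypothetical and will not reproduce the 0.86 or 0.27 cited above.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation: 1.0 is a perfect positive linear relationship."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one list never varies")
    return covariance / (spread_x * spread_y)


# Hypothetical team totals, not real 2012 data.
team_war = [48.0, 35.5, 41.0, 22.0, 30.5]
team_wins = [98, 81, 90, 66, 79]
r = pearson_correlation(team_war, team_wins)
print(round(r, 2))
```

A value near 1.0 means the two lists rise and fall together, and a value near zero means one tells you little about the other.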
These recipes will get even better because they get smarter with more data. Baseball Prospectus will soon incorporate into its WAR catchers' ability to frame pitches. The numbers next to each player's name on that site will change. Does that mean the numbers we have now are wrong? Of course they're wrong. Everybody is wrong about everything all the time, and WAR leaves room for this doubt. Doubt has driven us toward better answers for millennia, from Socrates' "I only know that I know nothing" to the guys who made billions betting against a seemingly invincible housing market. Don't accept any number that doesn't leave you room for doubt. "Baseball statistics," Bill James once wrote, "are always trying to mislead you."
ASKED LAST YEAR what he knew about WAR, Mike Trout said: "That's a good question. Not a lot." Neither did Miguel Cabrera. "I am not a computer guy," he said. Although they pay attention to their stats, they are in the business of playing, not calculating.
Yet baseball's front offices, the people in charge of $100 million payrolls and all your hope for the 2013 season, side overwhelmingly with data. For team executives, the basic framework of WAR -- measuring players' total performance against a consistent baseline -- is commonplace, used by nearly every front office, according to insiders. The writers who helped guide the creation of WAR over the decades -- including Bill James, Sean Smith and Keith Woolner -- work for teams now. As James told me, the war over WAR has ceased where it matters. "There's a practical necessity for measurements like that in a front office that make it irrelevant whether you like them or you don't."
Whether you do is up to you and ultimately matters only to you. In the larger perspective, the debate is over, and data won. So fight it if you'd like. But at a certain point, the question in any debate against science is: What are you really fighting and why?