Inside the stats that created 'Moneyball'
With "Moneyball" nominated for several Oscars and baseball season just around the corner, everybody's talking stats and sabermetrics. As you probably know, "Moneyball" tells the story of small-market, cost-conscious Oakland A's general manager Billy Beane (played by Brad Pitt). Beane turns to "unorthodox" statistical analysis in order to compete with big-market, big-spending teams by identifying players with high on-base percentages, good baserunning skills and solid defensive abilities.
This isn't exactly what sabermetrics is all about. But "Moneyball" (the movie) did introduce the basic concept of advanced statistical analysis to a mainstream audience. This is not a bad thing. But now that we're talking about advanced stats (short-handed as sabermetrics, thanks to Bill James of the Society for American Baseball Research) and great sites like Baseball Prospectus and FanGraphs, let's take a closer look.
WAR, OPS, wOBA, VORP, BABIP, FIP, UZR. You've probably heard them over the past few years. "Dude, his wOBA is off the charts, but his UZR is terrible." But what do all the terms mean? And how on earth do you pronounce them? I'm going to do my best to take you through many of the most common (and useful) concepts.
I can hear you now: "I was told there'd be no math!" It's true that some of the analytics are really complicated and involve complex formulas. But have no fear, this post is not about the actual math (many fine baseball sabermetricians around the Internet have already done all the math for you). The goal is to give you an easy-to-understand breakdown of the concepts behind the formulas and why they can be helpful, especially as we prepare for fantasy baseball season.
OPS (On-Base Plus Slugging)
Let's start with a pretty straightforward metric: OPS (can be pronounced opps or oh-pee-ess). This is an easy-to-understand offensive metric that provides a huge advance beyond the "basic" stats of RBIs, batting average and home runs. OPS is simply on-base percentage plus slugging percentage (total bases divided by at-bats). It's essentially a way to look at how a player contributes both in terms of getting on base and hitting for power. See, that wasn't too difficult.
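If you'd like to see the arithmetic anyway, here is a minimal sketch in Python. The on-base formula used is the commonly cited one (hits, walks and hit by pitch over at-bats, walks, hit by pitch and sacrifice flies); the function names and the stat line are made up for illustration.

```python
def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sac_flies):
    """Times on base divided by the plate appearances that count toward OBP."""
    reached = hits + walks + hit_by_pitch
    chances = at_bats + walks + hit_by_pitch + sac_flies
    return reached / chances if chances else 0.0


def slugging_percentage(singles, doubles, triples, home_runs, at_bats):
    """Total bases divided by at-bats."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats if at_bats else 0.0


def ops(singles, doubles, triples, home_runs, walks, hit_by_pitch, at_bats, sac_flies):
    """On-base percentage plus slugging percentage."""
    hits = singles + doubles + triples + home_runs
    obp = on_base_percentage(hits, walks, hit_by_pitch, at_bats, sac_flies)
    slg = slugging_percentage(singles, doubles, triples, home_runs, at_bats)
    return obp + slg


# Hypothetical season, not a real player's line.
example_ops = ops(singles=90, doubles=30, triples=5, home_runs=25,
                  walks=60, hit_by_pitch=5, at_bats=500, sac_flies=5)
print(f"OPS: {example_ops:.3f}")
```

That invented hitter gets on base at roughly a .377 clip and slugs .530, which puts him comfortably north of the .800 mark.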
To give you some context, take a look at the 2011 major league OPS leaders. It's a who's who of offensive juggernauts from Jose Bautista of the Blue Jays (1.056) to Miguel Cabrera of the Tigers (1.033) to NL MVP Ryan Braun of the Brewers (.994) to Matt Kemp of the Dodgers (.986). We won't get into a complex discussion about what constitutes a good OPS since it does vary by position, but it's safe to say that, regardless of position, an .800 OPS is solid.
wOBA (Weighted On-Base Average)
Your fantasy league probably has batting average as a category. That's not changing anytime soon (and if it is, it would be a move to OBP, not wOBA). What batting average doesn't factor in is the types of hits. wOBA (pronounced "whoa-bah") is another helpful and not-too-tricky offensive metric. The basic principle behind wOBA (as explained by FanGraphs) is straightforward: "not all hits are created equal."
Singles aren't worth as much as home runs, right? So under the wOBA formula (trust me, you don't want to see the formula), each way to reach base (nonintentional walks, hit by pitch, reached on error, singles, doubles, triples and home runs) has been assigned a different weight. A singles hitter could have a high batting average but a lower wOBA than a slugger who hits for power with a much lower batting average.
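For the curious, the weighting idea can be sketched in a few lines of Python. Big caveat: the weights below are illustrative placeholders, not the official values for any season, and the denominator is simplified to at-bats plus nonintentional walks and hit by pitch. Both hitters are invented.

```python
# Illustrative weights only: placeholders chosen for this sketch,
# not the official values published for any season.
ILLUSTRATIVE_WEIGHTS = {
    "walks": 0.7,            # nonintentional walks
    "hit_by_pitch": 0.7,
    "reached_on_error": 0.9,
    "singles": 0.9,
    "doubles": 1.25,
    "triples": 1.6,
    "home_runs": 2.0,
}


def batting_average(line):
    """Hits divided by at-bats."""
    hits = line["singles"] + line["doubles"] + line["triples"] + line["home_runs"]
    return hits / line["at_bats"]


def woba(line, weights=ILLUSTRATIVE_WEIGHTS):
    """Weighted sum of ways to reach base, divided by opportunities."""
    weighted = sum(weight * line.get(event, 0) for event, weight in weights.items())
    # Simplified denominator (an assumption of this sketch).
    opportunities = line["at_bats"] + line.get("walks", 0) + line.get("hit_by_pitch", 0)
    return weighted / opportunities


# Two hypothetical hitters with the same number of at-bats.
singles_hitter = {"at_bats": 550, "singles": 160, "doubles": 10, "triples": 2,
                  "home_runs": 3, "walks": 30, "hit_by_pitch": 2}
slugger = {"at_bats": 550, "singles": 70, "doubles": 35, "triples": 2,
           "home_runs": 40, "walks": 80, "hit_by_pitch": 5}

for name, line in (("singles hitter", singles_hitter), ("slugger", slugger)):
    print(f"{name}: AVG {batting_average(line):.3f}, wOBA {woba(line):.3f}")
```

With these made-up lines, the singles hitter wins the batting average race easily, but the slugger comes out well ahead once each way of reaching base is weighted.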
While there isn't any single offensive stat that tells you everything about a hitter, wOBA factors in batting average, on-base percentage and power in one neat little package. It's a good next step in the advancement of stats beyond OPS. While OPS is a useful, quick-and-dirty number, wOBA takes the extra step by meshing the components of OPS into one number. Remember that it doesn't take into account anything else about the game: runners on base, ballparks, etc.
According to FanGraphs, the league-average wOBA is around .330. Not surprisingly, many of the league leaders in wOBA are also the top offensive players in the game: Bautista (.441), Cabrera (.436), Braun (.433) and Kemp (.419).
FIP: Fielding Independent Pitching
We're all comfortable with ERA as a basic pitching statistic. But ERA gives us only the average of earned runs per nine innings. It's simple and straightforward: Low ERA is good. But what if there were a way to factor out all the things that the pitcher can't control?
Turns out some really smart guys named Tom Tango and Voros McCracken wanted to figure that out. So they devised FIP, a formula that includes the things pitchers can control -- home runs, walks, hit by pitch and strikeouts -- and eliminates everything else (hits, errors, quality of fielders, etc.). FIP does what it says: It looks at pitching independent of fielding and other variables that impact a pitcher's performance.
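For those who do want a peek under the hood, the commonly published version of FIP fits in one line of arithmetic. Here's a rough Python sketch; the function name and the sample pitcher are hypothetical, and the constant (which rescales FIP to look like ERA and changes a little every season) is assumed to be 3.10 purely for illustration.

```python
def fip(home_runs, walks, hit_by_pitch, strikeouts, innings, constant=3.10):
    """Fielding Independent Pitching from the outcomes a pitcher controls.

    `innings` is a true decimal (200 1/3 innings is 200.333, not 200.1).
    `constant` is an assumed league constant; the real one varies by year.
    """
    if innings <= 0:
        raise ValueError("innings must be positive")
    core = (13 * home_runs + 3 * (walks + hit_by_pitch) - 2 * strikeouts) / innings
    return core + constant


# Hypothetical pitcher: 20 HR, 50 BB, 5 HBP and 200 K over 200 innings.
example_fip = fip(home_runs=20, walks=50, hit_by_pitch=5, strikeouts=200, innings=200)
print(f"FIP: {example_fip:.2f}")
```

Notice what isn't in there: hits allowed, errors and anything else that depends on the eight guys standing behind the pitcher.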
What does this mean for you? FIP is an excellent way to predict a pitcher's future performance. But keep in mind that it's not a stand-alone data point that will help reveal the secret to your pitching staff's success in fantasy; it needs to work in conjunction with other tools in your stat toolbox.
A note on xFIP (Expected Fielding Independent Pitching, pronounced "ex-fip"). Many sabermetricians have added xFIP to their pitching prediction arsenal. FIP and xFIP use the same formula, but as FanGraphs explains, xFIP "replaces a pitcher's home run rate with the league-average rate (10.6% HR/FB) since pitcher home run rates have been shown to be very unstable over time"; xFIP attempts to normalize home run rates.
Let's take a look at some notable pitchers to see how their ERA, FIP and xFIP looked in 2011 (according to FanGraphs). Roy Halladay (Phillies): 2.35 ERA, 2.20 FIP, 2.71 xFIP. Clayton Kershaw (Dodgers): 2.28 ERA, 2.47 FIP, 2.84 xFIP. Justin Verlander (Tigers): 2.40 ERA, 2.99 FIP, 3.12 xFIP. All three had great years, but you'll see that, except for Halladay's FIP, every FIP and xFIP figure here is higher than the pitcher's ERA. Does that mean that a regression is in order? Probably not. But you should look for pitchers whose ERA is well above their FIP, because that's where the pitching values can be found. An example would be Toronto starter Brandon Morrow. His ERA was a not-great 4.72 but his FIP was a respectable 3.64. That 1.08 gap ranked as the third-highest ERA-FIP differential in the majors. It's not a guarantee, but he's someone to keep an eye on. After all, with both FIP and xFIP in your arsenal, you're all set to predict the pitching future. Sort of.
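If you keep your draft notes in a script, the bargain hunt boils down to sorting by ERA minus FIP. A small Python sketch using only the 2011 figures quoted above (the variable names are mine):

```python
# 2011 ERA and FIP as quoted in this article (via FanGraphs).
pitchers = {
    "Roy Halladay": {"era": 2.35, "fip": 2.20},
    "Clayton Kershaw": {"era": 2.28, "fip": 2.47},
    "Justin Verlander": {"era": 2.40, "fip": 2.99},
    "Brandon Morrow": {"era": 4.72, "fip": 3.64},
}

# A positive gap means the ERA looks worse than the fielding-independent number.
gaps = {name: round(line["era"] - line["fip"], 2) for name, line in pitchers.items()}

# Biggest gap first: these are the potential values.
ranked = sorted(gaps.items(), key=lambda item: item[1], reverse=True)

for name, gap in ranked:
    print(f"{name}: {gap:+.2f}")
```

Morrow tops this little list by a mile, while Verlander sits at the bottom because his ERA beat his FIP by more than half a run.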
UZR: Ultimate Zone Rating
How many times have you been watching a game when the shortstop (I'm not naming names) can't quite get to that grounder between short and second base? "Dammit," you mutter under your breath. "That will cost us a run!" Now we have a stat that actually confirms (sort of) whether or not the unnamed shortstop's failure cost your team a run.
UZR is our first defensive metric of the day. I'm going to keep this one short and sweet. Why? Because it's so complicated that there's no way to easily explain how UZR is calculated. For years, baseball statheads were forced to rely on fielding percentage to gauge a player's defense. UZR, created by statistician Mitchel Lichtman (probably best known for his work on "The Book"), is the sabermetric community's effort to quantify how good a fielder is at getting to balls in his defensive zone compared with the league average. How many runs did a player give up or save based on his ability to cover his zone as opposed to simply making the plays hit right at him? Simple, right?
A quick note on fielding percentage (FPCT). It falls short compared to UZR because a player's error total may stay low simply because he never gets to a large number of balls, and so never has the chance to botch the play. Put another way, would you rather your shortstop not get to a ball between short and second, forcing the center fielder to take it and letting the runner on second score? Or would you rather he have the range to get to it but perhaps flub the toss to the second baseman for the force, resulting in an error but also holding the runner from second at third?
A UZR between 3 and 5 is solid, 5-8 is great and 8+ would be considered elite. Of course, there are guys who get to everything (like Yankees left fielder Brett Gardner). According to FanGraphs, Gardner has saved 36.8 runs per 150 games. Evan Longoria, widely regarded as one of the best third basemen in the game, saves 15 runs per 150 games. On the other end of the spectrum, Derek Jeter costs the Yankees 1.8 runs per 150 games and Alex Rios costs the White Sox 2.4 runs per 150 games.
VORP: Value Over Replacement Player
VORP is one of my favorites. Baseball Prospectus, which created VORP, defines it as: "The number of runs contributed beyond what a replacement-level player at the same position would contribute if given the same percentage of team plate appearances. VORP scores do not consider the quality of a player's defense." It is far too complicated for baseball enthusiasts without baseball-related jobs to calculate, but it's the end result that matters.
The bottom line is that VORP provides a quantifiable, measurable number of a player's value in runs. How much more is Kemp worth compared to a "replacement player"? It might be helpful to understand what makes a "replacement player." Details, details. According to Keith Woolner, who invented VORP, a "replacement player" is "roughly 80 percent as good as an average major league hitter in that position."
Think of a replacement player as the run-of-the-mill guy on your team's Triple-A affiliate who will bounce up and down throughout the season to fill in.
Let's apply this concept to real life. The question is, how many more runs did Kemp deliver to the Dodgers versus another player roughly 80 percent as good as an average major league hitter in center field -- say, Trent Oeltjen or Eugenio Velez from Triple-A Albuquerque? The answer, according to BP, is 95.2. That is a remarkable number (MVP Braun's VORP was just 59.7). Bautista's VORP was an impressive 86.
And here's something fun -- it is possible for a major league player to have a negative VORP. This means that some players could actually be worse than a not-very-good replacement at that same position. Alex Rios of the White Sox created 5.4 fewer runs than a replacement-level right fielder in 2011. During the season, the White Sox chose to play Alejandro De Aza, called up from Triple-A Charlotte, for a spell over Rios. De Aza posted an 18.4 VORP. White Sox fans know that Rios is terrible, but just how much does he detract from the team's production? Now you know.
The real drawback with VORP is that it doesn't take into account defense, so it shows only a player's value in terms of offense. Which is why we also look at our next key metric.
WAR: Wins Above Replacement
WAR isn't perfect (no, I'm not making a political statement here), but wins above replacement is about as close to a "What does this player really mean to my team?" catchall valuation as we're going to get. Its definition is straightforward: How many more wins does a player add above a replacement-level player?
What makes WAR so tempting is it combines a bunch of metrics from almost every category and rolls them into one nice little number. It also works for position players and pitchers. You can find WAR valuations at FanGraphs (fWAR) and Baseball-Reference (rWAR). No matter where you find them, you'll have a basic numerical expression of the players' value in terms of wins.
For reference purposes, let's look at Baseball-Reference's key to WAR: 8+ WAR is an MVP candidate, 5+ WAR is All-Star Level, 2+ WAR is a solid starter, 0-2 WAR is a bench player (a 24th/25th man on the roster), while anything below 0 is replacement level.
According to FanGraphs, Jacoby Ellsbury led the majors with an otherworldly WAR of 9.4 in 2011. Kemp followed at 8.7, with Bautista behind him at 8.3 and Braun at 7.8. On the other end of the spectrum, Raul Ibanez registered a minus-1.3 WAR (probably one reason why he's looking for work right now). On the mound, Halladay led all pitchers with an 8.2 WAR. Verlander had an impressive 7.0 WAR and NL Cy Young winner Clayton Kershaw was 6.8.
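To keep the tiers straight, here's a tiny Python lookup built on the Baseball-Reference key quoted above. The function name is made up, and treating each cutoff as "at or above" is my own reading of the "+" signs.

```python
def war_tier(war):
    """Map a WAR value to the rough tiers from the Baseball-Reference key."""
    if war >= 8:
        return "MVP candidate"
    if war >= 5:
        return "All-Star level"
    if war >= 2:
        return "solid starter"
    if war >= 0:
        return "bench player"
    return "replacement level"


# 2011 WAR figures quoted in this article.
for name, war in (("Jacoby Ellsbury", 9.4), ("Ryan Braun", 7.8),
                  ("Clayton Kershaw", 6.8), ("Raul Ibanez", -1.3)):
    print(f"{name}: {war} -> {war_tier(war)}")
```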
BABIP: Batting Average on Balls In Play
Like many sabermetric measurements, BABIP has been around for more than 10 years (it was introduced in 2001 by Voros McCracken), but it had its coming-out party during the 2011 season. You couldn't see a discussion of a hitter or pitcher without someone throwing out, "Yeah, but he's been BABIP'd to death" or "He's been too lucky. Check out his BABIP." BABIP is a way to gauge whether a hitter or pitcher has had good or bad luck. According to those who really know these things (like ESPN's Tristan Cockcroft in this BABIP primer), there are far too many variables at play to make it a really reliable stat for hitters. But it's an extremely valuable tool to gauge a pitcher's luck and whether or not you might be getting a steal in your fantasy draft.
The average batting average on a ball in play (a ball in play includes singles, doubles, triples, sacrifice flies and all other outs but excludes strikeouts, home runs, walks and foul-outs) is around .300. The basic idea is that a pitcher doesn't have control over what happens to a ball once it's put in play. If a pitcher's BABIP is well above .300, he has likely been "unlucky." If it's lower than .300, he's considered lucky. You can use BABIP as you evaluate ERA and WHIP. A high BABIP means there is potential for a pitcher to improve (by getting "luckier") and thus reduce his ERA and WHIP. A low BABIP? Regression might be in the offing. Remember when we discussed UZR? An infield littered with guys who hold poor UZRs can lead to a higher-than-average BABIP for a pitcher as his fielders fail to turn playable balls into outs. Conversely, a pitcher with a quality infield can produce a lower-than-normal BABIP and actually sustain it if the handiwork of his fielders is the primary cause.
Let's look at a few notable pitchers from 2011 to see how lucky (or unlucky) they were. According to FanGraphs, Verlander's BABIP was an eye-poppingly low .236, while CC Sabathia's was .318. Kershaw's was .269 and Ian Kennedy's .270. Were they luckier than they were good? Maybe a little of both?
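BABIP is one of the few advanced stats you can work out on a napkin. A quick Python sketch using the commonly cited formula, hits minus home runs over at-bats minus strikeouts minus home runs plus sacrifice flies. The function names, the sample pitcher and the 15-point margin for calling someone lucky or unlucky are all assumptions for illustration.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies):
    """Batting average on balls in play (allowed by a pitcher, or hit by a batter)."""
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (hits - home_runs) / balls_in_play


def luck_read(value, baseline=0.300, margin=0.015):
    """Rough read on a pitcher's luck; `margin` is an arbitrary cushion."""
    if value - baseline > margin:
        return "unlucky"
    if baseline - value > margin:
        return "lucky"
    return "about average"


# Hypothetical pitcher: 160 hits (20 of them homers), 700 at-bats against,
# 200 strikeouts and 5 sacrifice flies.
example_babip = babip(hits=160, home_runs=20, at_bats=700, strikeouts=200, sac_flies=5)
print(f"BABIP: {example_babip:.3f} ({luck_read(example_babip)})")
```

Keep in mind that the label only flags a number worth a second look; a strong or shaky defense behind the pitcher can explain a gap that looks like luck.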