Last spring, a reporter asked then-Arizona Diamondbacks general manager Dave Stewart what he thought of a baseball projection system called PECOTA, which had just forecast his team to win 78 games. "They think we only win 78 games? That's a joke," Stewart said.
It's easy to mock Stewart now -- his Diamondbacks actually won 69 -- but the truth is that the GM, without time to prepare for the question, might have had little actual insight into the projection he had been asked about. He heard what sounded like a mean opinion, and he had a better opinion, so he said it out loud. His only mistakes were having an opinion and sharing it. And trading for Shelby Miller.
Tuesday, the baseball think tank Baseball Prospectus released its PECOTA projections for 2017, including projected team records. Many of us will look at them with curiosity, and we will also have opinions. The Orioles, PECOTA says, will win 72 games; you have opinions about that! Same with the Cardinals' projected 77 wins. The Cubs will win a relatively dowdy 90, the Royals a disastrous 71 and last season's cellar-dwelling Rays will win 84 games? I have opinions!
But Dave Stewart got fired, friends. We should aspire for more than opinions, especially because PECOTA is itself not an opinion. It follows a fairly simple and intuitive logical process to help us evaluate 30 complex collections of 25-plus moving parts. The future it describes sometimes turns out to be "a joke," and you and I are free -- even obliged -- to treat it skeptically. But to have anything of value to offer as a rebuttal, we must do better than Dave Stewart.
So, then, here's a consumer's guide to reading and reacting to PECOTA.
What's a PECOTA, and how can I be outraged by it?
Player Empirical Comparison and Optimization Test Algorithm, or PECOTA, is the projection system created by Nate Silver for Baseball Prospectus in the early 2000s. It has been modified and tweaked throughout the years but diligently forecasts individual and team performances in advance of each baseball season. It exists alongside other excellent forecasting systems, including the ZiPS projections created by ESPN's Dan Szymborski.
(I was Baseball Prospectus' editor-in-chief for two years. I had no role in creating or maintaining PECOTA, but I was often called on to publicly explain, say, its Diamondbacks projection.)
You can be outraged by it. Maybe you think it contains spoilers. Maybe you think baseball is too unprojectable to bother. Maybe you are mad about some other projection Nate Silver has made since he retired from PECOTA. I set no limits on our capacity to be annoyed. But to cut to the chase: The Orioles projection might annoy you.
The Orioles were one of last year's AL wild-card teams, after PECOTA projected them to win only 74 games. They are the winningest team in the American League over the past five years, outperforming their projections by an average of 14 games a year. They have earned the right to roll their eyes.
But PECOTA knows all of this -- the 89 wins, the five years of excellence -- and it still sees something wrong with this club. To understand why, it helps to understand how.
Why does PECOTA hate the Orioles?
The P in PECOTA, you'll recall, stands for Player; the T, you'll recall, does not stand for Team. PECOTA only knows how to project player performances. It knows that Chris Davis exists, that Kevin Gausman and Joe Gunkel and Aneury Tavarez exist, but it does not know that the Orioles exist. It creates projections for each player, then it stops.
From there, BP applies some human hands and some basic understandings about baseball to turn individual projections into team projections. The basic understandings are: Wins are built out of runs, and a team's record will reflect its ability to outscore its opponents; and runs are built out of rallies and homers, so a team's runs scored and allowed will reflect its ability to rap hits, knock dingers, swipe bags, and do other cool baseball verbs.
To estimate how many runs a team will score and allow, BP needs to know who is going to be on the field and how often. So human hands -- BP staff hands -- assign each player to a depth chart. Davis will bat more often than Tavarez and Gausman will pitch more often than Gunkel, and estimating exactly how much more creates an estimate for how many hits will lead to how many runs will lead to how many wins. That's the team projection you see.
But clearly the Orioles didn't get 17 wins worse this winter. Where does PECOTA get this insanity?
I will describe all of this dispassionately. Maybe we'll argue about it in a minute, but for now I'm just relaying the differences that PECOTA identified between last year and this year. We're cool and calm, you and I.
For starters, PECOTA didn't think the Orioles won 89 games last year. I know, I know, they did, but PECOTA has only been taught to see performances -- hits, walks, that sort of thing -- rather than wins and losses. The Orioles won 89 games last year, but their runs scored and runs allowed were more consistent with the performance of an 84-win team. (This is what's known as a Pythagorean record.) PECOTA's more detailed assessment of their offense and pitching -- the Orioles' ability to get hits and stop hits, basically -- was more consistent with a team that would typically win 85 games. (BP calls this their third-order record.) So PECOTA isn't projecting a 17-win drop, but only a 13-win drop.
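If you want to check the Pythagorean part yourself, here is a minimal sketch. It uses the classic form of the formula with an exponent of 2, which may not be the exact variant BP uses. The 744 runs scored comes from this article; the 715 runs allowed is not quoted here, so treat it as an assumed input.

```python
def pythagorean_wins(runs_scored, runs_allowed, games=162, exponent=2.0):
    """Expected wins from runs scored and allowed (classic Pythagorean form)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return games * rs / (rs + ra)

# 744 runs scored is from the article; 715 runs allowed is an assumed input.
orioles_2016 = pythagorean_wins(744, 715)
print(round(orioles_2016))  # prints 84
```

A team that scores exactly as many runs as it allows comes out at 81 wins, which is a handy sanity check on the function.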
Still, though: That's a lot. Maybe it's too much, but, importantly, we can actually look and see where it's coming from. We just have to compare last year's individual performances with this year's individual projections:
Manny Machado drops from 5.7 wins above replacement to 4.7 wins
J.J. Hardy drops from 1.8 wins to 0.2 wins
Davis, Jonathan Schoop, Hyun Soo Kim and Adam Jones all hold steady from last year. And in right field and DH, Seth Smith and Mark Trumbo more or less recreate last year's combined performance of Pedro Alvarez and Trumbo, with a slight improvement.
Sure enough, last year's Orioles scored 744 runs, and this year's Orioles project to score 712. That's not a big difference: roughly 30 fewer runs would mean about three fewer wins. You and PECOTA probably almost agree on what the Orioles' offense will do.
But on the pitching side, PECOTA sees this year's Orioles allowing 103 runs more, which would cost about 10 wins. Chris Tillman, Dylan Bundy and Gausman project to be about five wins worse by themselves, with Gausman's ERA ballooning to 4.34. Zach Britton, Brad Brach and Mychal Givens all project to be good, but worse, in the bullpen, adding up to another four wins lost.
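The arithmetic behind those two paragraphs is the familiar rule of thumb that about 10 runs are worth about one win. Here it is as a rough sketch; the constant is a rule of thumb implied by the numbers above, not a value taken from PECOTA itself, and the function name is just for illustration.

```python
RUNS_PER_WIN = 10  # rule of thumb implied by the article: ~10 runs is ~1 win

def wins_from_runs(run_change):
    """Convert a change in runs (scored or allowed) into approximate wins."""
    return run_change / RUNS_PER_WIN

offense_loss = wins_from_runs(744 - 712)  # 32 fewer runs scored
pitching_loss = wins_from_runs(103)       # 103 more runs allowed
total_loss = offense_loss + pitching_loss

print(round(offense_loss, 1), round(pitching_loss, 1), round(total_loss, 1))
```

About three wins from the offense plus about 10 from the pitching lands right around the 13-win gap between last year's 85-win third-order record and this year's 72-win projection.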
The truth is that none of this is that complicated. About 95 percent of the work of these projections is taking last year's performance, and the year before it, and the year before that, and maybe some more years before those, weighting them so the most recent performance is the most powerful, and then presuming that the player is who he has been. The other five percent is the details on the edges: How much did the player's home ballpark or quality of competition skew his stats? How old is he, and what sort of aging curve will he follow? Which stats are most predictive? How much will he play?
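That 95 percent can be sketched in a few lines. To be clear, this is an illustration of the general idea and not PECOTA's actual method: the 5/4/3 weights and the player's numbers are made up, and the real system layers ballparks, aging and everything else on top.

```python
def weighted_baseline(seasons, weights):
    """Weighted average of a rate stat; seasons are ordered most recent first."""
    if len(seasons) != len(weights):
        raise ValueError("need one weight per season")
    total_weight = sum(weights)
    return sum(s * w for s, w in zip(seasons, weights)) / total_weight

# Hypothetical player: a rate stat over the last three seasons, newest first.
recent_seasons = [0.300, 0.270, 0.260]
weights = [5, 4, 3]  # assumed weights, NOT PECOTA's actual ones

baseline = weighted_baseline(recent_seasons, weights)
print(round(baseline, 3))  # prints 0.28
```

A plain average of those three seasons would be about .277; leaning on the most recent year pulls the estimate up to .280, which is the whole point of the weighting.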
So Machado loses a win from his projection. But most of that is based on the depth chart: Last year he was fully healthy and batted 696 times. The year before that, he led the league with 713 plate appearances. The year before that, he missed half the season. There are versions of the 2017 season in which he plays 162 games, bats 700 times, and outperforms his PECOTA projection. But there are others in which he tweaks a hamstring and misses two weeks, and others still in which he gets hit on the wrist by a pitch and misses four months. PECOTA's aiming for an average, so it sucks in some of the chance of disaster, waters it down and emerges with a slightly conservative estimate.
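Here is that "aiming for an average" idea as a toy calculation. Every number in it is hypothetical: the probabilities and plate-appearance totals are invented to mirror the three kinds of seasons described above, and nothing here reflects how BP actually builds its playing-time estimates.

```python
# Hypothetical health scenarios: (probability, plate appearances).
scenarios = [
    (0.60, 700),  # healthy all year
    (0.30, 640),  # tweaks a hamstring, misses about two weeks
    (0.10, 250),  # hit on the wrist, misses about four months
]

def expected_playing_time(scenarios):
    """Probability-weighted average of plate appearances across scenarios."""
    total_prob = sum(p for p, _ in scenarios)
    if abs(total_prob - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(p * pa for p, pa in scenarios)

expected_pa = expected_playing_time(scenarios)
print(round(expected_pa))  # prints 637
```

Even with the healthy season as the most likely outcome, the average comes in well under 700, so a projection built on it will look a little conservative next to a full, healthy year.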
Otherwise, PECOTA says Machado is basically going to be the same guy he's been. He's projected to have a True Average -- that's BP's all-encompassing hitting stat -- of .280, which is exactly what his career True Average is. PECOTA is in many ways the world's most boring pundit: You ask what's going to happen, and it basically just tells you what has already happened, with a bit of context and some boring explanation of aging curves and run environments.
But the Gausman part?
Last year, we ran a little game over at Baseball Prospectus. We wanted to know whether a crowd of smart people could spot the bad projections in a bunch. A few hundred people went through all of PECOTA's individual player projections and, when they saw one that looked too pessimistic, they clicked "over," and when they saw one that looked too optimistic they clicked "under."
The results were interesting. There were a handful of players for whom a very clear consensus emerged: Forty-five of 46 people picked the over on Machado, for instance, and he sure enough beat his projection last year. Thirty-nine of 41 people who picked J.D. Martinez picked the over, and he beat his projection. This trend -- overwhelming crowd opinion beating the PECOTA projection -- continued throughout the 20 players who had the most lopsided consensus around them.
(Not a 100 percent chance, though. All 66 people who chose Bryce Harper for the game chose the over, making him the most lopsided piece in the game. Harper undershot his projection.)
I sort of think that the Gausman projection is too low. There's a whole article somebody could write on why, but I'll just leave it at that: I looked at it for a while, and it still looked crazy. Do you agree? Do you all agree? Then I'm comfortable saying that one area where the Orioles are better than a 72-win team is in the Gausman projection.
So where are we on this? Can I hate these projections?
By all means! I don't, but you can, and nobody will make you use them. But a good way of thinking about the projections is by keeping in mind those big, broad things you and they probably agree on. Here are what I'd consider the three foundational beliefs of a team projection system:
The past is a good guide for a player's future
A team is a collection of players
Good teams usually score a lot of runs and don't allow many
Those all seem intuitive and uncontroversial, and if you agree with them you're 95 percent of the way to where PECOTA ends up. All that's left is the details, which leave plenty of room to disagree. Dig into them -- like Jayson Stark did expertly to explore last year's Big PECOTA Controversy, the Royals -- and you'll find the blind spots that make humans irreplaceable.
Dave Stewart is really, really smart, after all. He has knowledge about baseball PECOTA will never have. He just needed to use it.