FIELDf/x is going to change everything

As I threatened last Friday, on Saturday I attended Sportvision's PITCHf/x Summit in San Francisco. I really can't say enough about Sportvision's graciousness in so many ways, or about the spirit of discovery that everyone in the room seemed to share.

The Knuckleball blog has a short recap, and Baseball Prospectus' Ben Lindbergh has a looonnnnngggg recap. Which one you read should probably depend on how much time you've got. Oh, and if you want a really abbreviated, real-time account of the event, you can check my Twitter feed beginning at 9am Pacific on Saturday morning. Or you can see what everyone (on Twitter) had to say. So many options these days, and all to serve you, the Dear Reader ...

Anyway, most of my work here is done. It's already become a cliché to suggest that PITCHf/x and (especially) FIELDf/x are "going to change everything," but at least this cliché has the virtue of being true. It may take a few years, if only because the volume of data that will flood the mainframes in 2011 is so massive that it'll take some time to get a handle on the stuff. And nobody knows who's going to have access to what; PITCHf/x has been made available to the masses, but we might not be so lucky with FIELDf/x. But for all the reasons you can read about in the various recaps -- and many more -- baseball analysis is heading for radical changes. And anyone who doesn't get on board will be ... well, left at the station. And losing baseball games.

I just want to mention one of the presentations, because the topic is actually elementary enough for me to sort of wrap my brain around ... Greg Rybarczyk -- the genius behind Hit Tracker -- introduced the idea for a new defensive metric: True Defensive Range (TDR).

Why on earth would anyone want another new defensive metric?

As Greg pointed out, all of our current "new" defensive metrics are "zone-based"; that is, they begin by separating the field into distinct zones, noting in which zone a play has been made, and then apportioning credit (or not) when a fielder makes a play (or doesn't) in that zone. I'm simplifying, of course, but essentially every reputable system now in use shares two significant defects: the "zones" don't fit neatly into today's highly variable outfield dimensions, and the systems don't have any way to account for the fielder's starting position.

What's more, Greg argued that the great majority of plays made by outfielders are essentially irrelevant. If an outfielder doesn't have to move to catch a fly ball, that play still counts in his favor. If he can't catch that same fly ball because he was pulled way in with the potential winning run on third base in the ninth inning, that play still counts against him.

What Greg proposes is stripping out the plays that any outfielder could make, and the plays that perhaps no outfielder could make, and looking instead at only the plays that might be made. And this might be done given FIELDf/x's ability to record an outfielder's starting position, the hang time of the batted ball, and the baseball's landing point.
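For the code-minded, here's a rough sketch of how that filtering might work. To be clear, this is my own guess at the idea and not Greg's actual method: the two speed cutoffs, the `Play` record, and every function name below are hypothetical, invented purely for illustration. The sketch takes the three things FIELDf/x is supposed to record (starting position, hang time, landing point), works out how fast the fielder would have had to run in a straight line, and keeps only the plays that fall between "anyone gets there" and "nobody gets there."

```python
from dataclasses import dataclass
from math import hypot

# Hypothetical cutoffs in feet per second; these are NOT from the presentation.
ROUTINE_MAX_SPEED = 10.0     # at or below this, assume any outfielder makes the play
REACHABLE_MAX_SPEED = 27.0   # above this, assume no outfielder makes the play


@dataclass
class Play:
    start: tuple[float, float]    # fielder's starting position, in feet
    landing: tuple[float, float]  # where the ball came down, in feet
    hang_time: float              # seconds the ball was in the air (assumed > 0)
    caught: bool


def required_speed(play: Play) -> float:
    """Average straight-line speed needed to reach the landing point in time."""
    distance = hypot(play.landing[0] - play.start[0],
                     play.landing[1] - play.start[1])
    return distance / play.hang_time


def classify(play: Play) -> str:
    """Sort a play into 'routine', 'contested', or 'unreachable'."""
    speed = required_speed(play)
    if speed <= ROUTINE_MAX_SPEED:
        return "routine"
    if speed > REACHABLE_MAX_SPEED:
        return "unreachable"
    return "contested"


def contested_catch_rate(plays):
    """Share of contested plays that were caught, or None if there were none."""
    contested = [p for p in plays if classify(p) == "contested"]
    if not contested:
        return None
    return sum(p.caught for p in contested) / len(contested)


# Made-up plays, just to exercise the functions.
plays = [
    Play((0, 300), (5, 300), 4.0, True),     # 5 feet in 4 seconds
    Play((0, 300), (60, 380), 5.0, True),    # 100 feet in 5 seconds
    Play((0, 300), (30, 340), 2.5, False),   # 50 feet in 2.5 seconds
    Play((0, 300), (90, 420), 5.0, False),   # 150 feet in 5 seconds
]

print([classify(p) for p in plays])
print(contested_catch_rate(plays))
```

A real version would presumably have to deal with things this one ignores, like reaction time, the route the fielder takes, and the outfield wall. The point is only that the can't-miss catch and the hopeless chase both drop out before anything gets counted.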

Now, this does lead to a question about sample size. It's lately become a truism that we need something like three seasons before modern fielding metrics give us a great read on a player's defensive contribution. We have, I think, come to this conclusion not because of any experimental evidence, but simply because the year-to-year numbers fluctuate so wildly. Essentially, we waited until the numbers didn't make sense and then said, "Whoa, there. Don't take these numbers so seriously. We meant these as just an approximation. Wait a couple more years and get back to us."

Rybarczyk believes that even as we cut our measured plays from a few hundred per player per season to fewer than 100, what we lose in sample size we'll gain in sample importance, and that the one-season numbers will give us a more accurate appraisal than we've got now.

I don't have any idea if he's right about that. But I'm fairly sure we're going to find out.