Big data will change college hoops, too

In October, Duke made a watershed announcement: It would become the first college basketball program to install STATS LLC’s SportVU cameras in its arena.

For years, NBA franchises have been recording, compiling and analyzing the data that SportVU’s all-seeing wide-angle lenses provide. This summer, the league announced that every NBA arena would install the technology. College basketball has lagged behind for obvious financial and logistical reasons, but Duke’s news made the presumed SportVU trickle-down a matter of when, not if. Sooner or later, big data was coming to college hoops.

Thursday, Grantland’s Kirk Goldsberry showed just how far ahead the NBA really is. But even more important is what Goldsberry’s piece implies about how much anyone who watches and loves the game of basketball -- at any level -- stands to gain.

Goldsberry, a Harvard visiting scholar and geography Ph.D., has been making spatial NBA maps for years. These charts are always insightful, even when they merely reinforce how insanely good LeBron James is. But Goldsberry’s piece Thursday dives much deeper. It begins as anecdotal demonstration -- so here’s why the Spurs are really good -- and ends as a full-fledged state of the field, with the story of how two Harvard statistics whiz kids managed to divine some wisdom out of the trove of data the SportVU cameras dropped in Goldsberry’s lap:

"Early in the spring semester of 2013, Cervone and D’Amour proposed a new project to measure performance value in the NBA. The nature of their idea was relatively simple, but the computation required to pull it off was not. Their core premise was this: Every ‘state’ of a basketball possession has a value ...

It was their belief that, using the troves of SportVU data, we could -- for the first time -- estimate these values for every split second of an entire NBA season. They proposed that if we could build a model that accounts for a few key factors -- like the locations of the players, their individual scoring abilities, who possesses the ball, his on-ball tendencies, and his position on the court -- we could start to quantify performance value in the NBA in a new way. ... Cervone and D’Amour’s central thesis is that no matter where you pause the game, that you could scientifically estimate the ‘expected possession value,’ or EPV, of that possession at that time."

What does “expected possession value” promise, exactly?

If we can estimate the EPV of any moment of any given game, we can start to quantify performance in a more sophisticated way. We can derive the “value” of things like entry passes, dribble drives, and double-teams. We can more accurately quantify which pick-and-roll defenses work best against certain teams and players. By extracting and analyzing the game’s elementary acts, we can isolate which little pieces of basketball strategy are more or less effective, and which players are best at executing them.
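
To make the idea concrete, here is a minimal sketch of the arithmetic behind "the value of an entry pass." It is not Cervone and D’Amour’s model, which estimates these quantities from SportVU tracking data. The `Outcome` class, the function name, and every probability below are hypothetical, chosen only to show that an action’s value is the change in expected points it produces.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    probability: float  # chance the possession ends this way (made-up numbers)
    points: float       # points scored if it does


def expected_possession_value(outcomes):
    """Probability-weighted average of points over all possible endings."""
    total = sum(o.probability for o in outcomes)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("outcome probabilities must sum to 1")
    return sum(o.probability * o.points for o in outcomes)


# Hypothetical state just before an entry pass: ball on the wing.
before_pass = [Outcome(0.40, 2), Outcome(0.05, 3), Outcome(0.55, 0)]
# Hypothetical state just after: ball in the post, a two is more likely.
after_pass = [Outcome(0.55, 2), Outcome(0.05, 3), Outcome(0.40, 0)]

epv_before = expected_possession_value(before_pass)
epv_after = expected_possession_value(after_pass)

# The "value" of the pass is how much it moved the possession's expected points.
pass_value = epv_after - epv_before
print(f"EPV before: {epv_before:.2f}, after: {epv_after:.2f}, pass value: {pass_value:+.2f}")
```

With these invented numbers the pass is worth about three-tenths of a point. The hard part, and the part the Harvard researchers took on, is estimating those probabilities for every split second of real games.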

General statistical understanding of college basketball is already at its highest level ever. Per-possession data like Ken Pomeroy’s offers a key insight, which is why you see us use it so often: Among other things, it grants us the fundamental ability to understand each individual trip down the floor. Dean Oliver’s Four Factors led the way, and helped us color in those lines. Hoop-Math.com highlights stylistic underpinnings; Synergy scouting data creates order from visual madness; the BPI posits a tournament selection middle way.
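
For readers who haven’t seen per-possession math spelled out, here is a rough sketch. It assumes the common box-score estimate of possessions (field goal attempts minus offensive rebounds plus turnovers plus a fraction of free throw attempts); the 0.475 free throw weight is the value usually cited for the college game and should be treated as an assumption. The function names and both stat lines are invented for illustration.

```python
def estimate_possessions(fga, oreb, turnovers, fta, ft_weight=0.475):
    """Estimate possessions from box-score totals (ft_weight is an assumption)."""
    return fga - oreb + turnovers + ft_weight * fta


def points_per_100(points, possessions):
    """Offensive efficiency: points scored per 100 possessions."""
    return 100.0 * points / possessions


# Hypothetical up-tempo team: lots of points, lots of possessions.
poss_a = estimate_possessions(fga=60, oreb=10, turnovers=12, fta=20)
eff_a = points_per_100(80, poss_a)

# Hypothetical slow-paced team: fewer points, far fewer possessions.
poss_b = estimate_possessions(fga=50, oreb=8, turnovers=10, fta=16)
eff_b = points_per_100(68, poss_b)

print(f"Team A: {poss_a:.1f} possessions, {eff_a:.1f} points per 100")
print(f"Team B: {poss_b:.1f} possessions, {eff_b:.1f} points per 100")
```

In this made-up example the slower team scores 12 fewer points but is the more efficient offense, which is exactly the kind of thing raw box-score totals hide.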

Tempo-free statistics are the combustion engine to the box score’s horse-drawn carriage. EPV is like discovering a mass relay. We still have glaring holes in our knowledge, places stats don’t see -- how players set screens, whether they block out on the weak side, whether they’re a step slow or a step late on defensive rotations. We know these things share relationships with winning basketball games, but how much? And who does them well? And how can one pair of eyeballs process all of that information? SportVU’s big data will clear these hurdles for us.

And that’s still not the really exciting part.

After all, the vast majority of college basketball arenas aren’t going to be outfitted with SportVU cameras. Most Division I basketball teams have more crucial expenses to worry about. Even the well-heeled portions of the NCAA membership take forever to standardize these types of things, and even then it won’t behoove the schools that do pay for this data to share it publicly. You and I won’t be sifting through SportVU spreadsheet models any time soon.

The exciting part is not the data itself, however. It’s the way the general contextual understanding of that data’s mere existence will change the way we view the game. Smart NBA people will put the information to work; smart college folks will keep up. The knowledge that each slice of a possession has a specific value, and that value fluctuates based on a number of intricately entwined factors, isn’t about XML routines in Excel files. It’s about knowing that all of the little things that go into each and every possession -- and each and every half, and each and every game -- have an impact on that game’s outcome.

It’s about grading the process vs. the result. About knowing which one is more important. The specific numbers are almost beside the point. It’s the idea that matters.

Much like tempo-free information, which Dean Smith recognized 50 years ago, this is a thrilling new spin on an old idea. Good coaches know this stuff already. (Nick Saban has turned it into gospel.) But as more and more fans come to view the game through this lens, the potency of dumb old arguments about one team being better than the other because of one bounce or two, one shot made or missed -- all of those “he’s a winner” tropes carried over from decades of purple sportswriting -- will erode. Smart fans will know better. We’ll be watching the game differently, and more intelligently, because of it. Even if we don’t know why.

That’s the real trickle-down effect SportVU promises. And it begins: now.