Following Up with Haralabos Voulgaris

Wednesday's conversation with gambler Haralabos Voulgaris prompted a lot of interesting comments and e-mails.

An important one came from Voulgaris himself. He posted a comment on the original post that distinguishes his strong feelings about Tim Donaghy's actions from his more ambivalent thoughts about claims that the NBA fixed games.

I think that Donaghy has very little credibility. This is a guy who has said all along that his wagering on the games didn't affect how he called the games. I am fairly certain that this is false. Henry asked me my opinion on the latest news last night and I responded with my thoughts. I have never really put too much thought into whether or not the NBA as a league fixes games. I have, however, researched the Donaghy scandal quite extensively and am certain that Donaghy did fix games. I'd much rather focus on Donaghy than on the Lakers vs. Kings series.

I followed up with Voulgaris, based on the conversation he has inspired. He talked at length about his database of NBA games (even providing a screen capture), how he uses statistics to beat the oddsmakers, and much more. Our exchange follows:

Perhaps because of the way I wrote the article, I may have placed undue emphasis on your statistical database. In turn, some have doubted the claims of your database, and/or the number of games you watch, etc. What can you say to reassure those who suspect you may be inflating your dedication to your craft?

Regarding my database, I am not sure how we can convince people regarding it, and truthfully I am not sure I really care!

If people are wondering what type of database it is, I'd be happy to explain it a bit more.

Basically, it's five years of every single play that has occurred in every NBA game. I didn't invent the collection of data. All the data that I have is out there for anyone that wants to do the same. It took me about nine months and a few hundred thousand to perfect the collection and organization of the data in such a way that it is valuable for predictive purposes. I use a program called "Stata" to help with analysis of the data, but you could also use a freeware program called "R" to do the same.

In addition, I'd like to add that I employ two very skilled programmers. They wrote much of the code that allows me to collect and organize the data in such a way that it's much easier to analyze. I credit them for the collection of the data.

People are questioning whether or not I am overstating how much data I have. As I mentioned, anyone willing to spend the time or the money would be able to get the same data. The issue isn't the data, it's what you do with it when you get it.

The play-by-play data (screenshot from the system) has the following:

  • The time of the possession.

  • The player who initiated the possession (in the case of a steal or defensive rebound).

  • The opposing player who initiated the possession (in case of a missed shot or turnover) -- including the location on the floor the shot was taken from, and some other unique identifiers we use to classify the type of possession.

In the basketball analytic world (stat geeks), people talk of Offensive and Defensive Efficiency (points per possession). I take it a step further and break down the efficiency of different types of possessions: possessions off of turnovers, made baskets, missed shots (broken down a step further by type of missed shot), etc.
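As a rough illustration of that breakdown (not Voulgaris's actual code, which he describes as proprietary), points per possession can be grouped by how each possession began. The record layout, the possession-type labels, and the sample rows below are all assumptions made up for this sketch.

```python
from collections import defaultdict

# Hypothetical possession records: (how the possession started, points scored).
# These rows are invented for illustration; they are not real NBA data.
possessions = [
    ("turnover", 2),
    ("turnover", 0),
    ("made_basket", 2),
    ("made_basket", 0),
    ("missed_shot", 3),
    ("missed_shot", 0),
    ("missed_shot", 2),
    ("turnover", 3),
]


def efficiency_by_type(records):
    """Return points per possession for each possession type."""
    points = defaultdict(int)
    counts = defaultdict(int)
    for start_type, pts in records:
        points[start_type] += pts
        counts[start_type] += 1
    return {t: points[t] / counts[t] for t in counts}


ppp = efficiency_by_type(possessions)
for start_type, value in sorted(ppp.items()):
    print(f"{start_type}: {value:.2f} points per possession")
```

The same grouping could be extended with further keys (for example, the type of missed shot) by making the first element of each record a tuple.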

As I mentioned, once you have the data you can begin to do all sorts of cool things with it.

The NBA has an interesting application on their site called "Hot Zones" that shows you how a player shoots from a certain spot on the floor. I can give you the same information but I can break it down with much more detail.

Take Kobe Bryant, for example: we have every shot he (and every other player in the league) has taken in the last five seasons.

How does Kobe Bryant's shooting chart look when Bruce Bowen is on the floor (guarding him)? How does it look with Shane Battier? (I know, small sample size, but you get the idea.)

How does it look with X lineup for the Lakers, or how does it look with just Gasol on the floor? And on and on.
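One way to picture these queries is as a filter over a shot log, followed by a per-zone summary. The sketch below assumes a made-up record layout (the `shooter`, `zone`, `made`, and `opponents_on_floor` fields are hypothetical) and uses placeholder names rather than real players or real shots.

```python
# Hypothetical shot log; every field name and value here is invented.
shots = [
    {"shooter": "A", "zone": "left_wing", "made": True, "opponents_on_floor": {"X", "Y"}},
    {"shooter": "A", "zone": "left_wing", "made": False, "opponents_on_floor": {"X"}},
    {"shooter": "A", "zone": "right_corner", "made": True, "opponents_on_floor": {"Y"}},
    {"shooter": "A", "zone": "left_wing", "made": True, "opponents_on_floor": {"Y"}},
    {"shooter": "B", "zone": "paint", "made": True, "opponents_on_floor": {"X"}},
]


def shooting_by_zone(shot_log, shooter, defender=None):
    """Field-goal percentage per zone for `shooter`.

    If `defender` is given, only shots taken while that opponent
    was on the floor are counted.
    """
    made, attempts = {}, {}
    for shot in shot_log:
        if shot["shooter"] != shooter:
            continue
        if defender is not None and defender not in shot["opponents_on_floor"]:
            continue
        zone = shot["zone"]
        attempts[zone] = attempts.get(zone, 0) + 1
        made[zone] = made.get(zone, 0) + int(shot["made"])
    return {zone: made[zone] / attempts[zone] for zone in attempts}


overall = shooting_by_zone(shots, "A")
versus_x = shooting_by_zone(shots, "A", defender="X")
print("overall:", overall)
print("with X on the floor:", versus_x)
```

Filtering this way shrinks the sample quickly, which is the small-sample caveat raised above.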

Bruce Bowen of the Spurs is an extreme example of how this type of data can help a team. This is a guy that is basically of negative offensive value at every single spot on the floor with the exception of the corner 3-point shot. There are other players that exhibit similar tendencies albeit less extreme. If you were doubling Duncan, your first instinct may be to double off of Bowen, but not if it meant he was open for a corner 3.

My numbers tell me that the corner 3-point shot is the second most efficient shot in basketball next to the layup. Why do some teams spray 3-pointers all over the floor?

Again, the Spurs seem to understand this.
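"Most efficient" here presumably means expected points per attempt: the value of the shot multiplied by how often it goes in. A minimal sketch of that arithmetic follows; the make rates are placeholders invented for illustration (chosen only so the ordering matches the claim above), not figures from the database.

```python
def points_per_shot(shot_value, make_rate):
    """Expected points from one attempt: value of the shot times make rate."""
    return shot_value * make_rate


# Placeholder (shot value, make rate) pairs, invented for illustration only.
zones = {
    "layup": (2, 0.62),
    "corner_three": (3, 0.40),
    "above_break_three": (3, 0.35),
    "long_two": (2, 0.40),
}

# Rank zones from most to least efficient under these assumed rates.
ranking = sorted(zones, key=lambda z: points_per_shot(*zones[z]), reverse=True)
for zone in ranking:
    print(f"{zone}: {points_per_shot(*zones[zone]):.2f} points per shot")
```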

Over the course of a season or even several seasons, teams exhibit a certain offensive profile in terms of where they like to get their offense. What opposing teams are best at stopping a team from getting their preferred offense? In an 82-game season the sample sizes obviously become an issue, but patterns still emerge and the data is extremely useful.

Someone questioned whether or not I really watch every game. Here's a screenshot of my desktop computer. I have almost every game from the midway point of the 2006-2007 season until the current season. I guess you could argue that I go through the trouble of archiving every game but fail to watch them; I'm not really sure how to convince someone otherwise.

There was a funny segment where the World Poker Tour crew came to my house to film some footage for a tournament broadcast. They used a quote of me telling my girlfriend that Thursday is date night because there are only two NBA games. I think that is basically the best proof of all that I watch far too much NBA. Here I have this fantastically beautiful girlfriend, and she gets one night a week. If that isn't dedication, I don't know what is.

Can you talk a bit about how your system predicts who will be guarding whom?
It's pretty simple, but it's also proprietary so I'd rather not give out too much information on it. Basically we have a script that goes through the play-by-play looking for certain players and instituting a decision tree as to what position they are likely to be playing on offense and who they are likely to guard on the other team.

Some of the things it looks for are best offensive players at the 1, 2 and 3 positions and best defensive players at the corresponding positions. Bruce Bowen and Shane Battier are a few examples of players that we assign a "stopper" value and assume that if they are on the court they are likely to be guarding the other team's best offensive player at the corresponding position(s). It's not exact but it's fairly accurate. I am also able to go through a game after it's over and correct the defensive and offensive assignments as needed. Ideally I'd like the program to learn from the corrections, but we are not at that point yet.
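Since the real script is proprietary, the following is only a guess at what a rule-based assignment of this kind might look like. The `Player` fields, the `offense_rating` number, and both rules are assumptions for illustration: a flagged "stopper" takes the best remaining scorer at positions 1 through 3, and everyone else takes the remaining opponent closest in position.

```python
from dataclasses import dataclass


@dataclass
class Player:
    name: str
    position: int            # 1 = point guard, 2 = shooting guard, 3 = small forward
    offense_rating: float    # hypothetical rating, higher means better scorer
    is_stopper: bool = False


def assign_matchups(defense, offense):
    """Map each defender's name to the offensive player he is guessed to guard."""
    unassigned = list(offense)
    matchups = {}

    # Rule 1 (assumed): a stopper guards the best remaining perimeter scorer.
    for defender in defense:
        if not defender.is_stopper:
            continue
        perimeter = [o for o in unassigned if o.position in (1, 2, 3)]
        if perimeter:
            target = max(perimeter, key=lambda o: o.offense_rating)
            matchups[defender.name] = target.name
            unassigned.remove(target)

    # Rule 2 (assumed): everyone else guards the remaining player closest in position.
    for defender in defense:
        if defender.name in matchups or not unassigned:
            continue
        target = min(unassigned, key=lambda o: abs(o.position - defender.position))
        matchups[defender.name] = target.name
        unassigned.remove(target)

    return matchups


# Placeholder lineups; names and ratings are invented.
defense = [
    Player("D_point", 1, 0.0),
    Player("D_wing_stopper", 3, 0.0, is_stopper=True),
    Player("D_guard", 2, 0.0),
]
offense = [
    Player("O_point", 1, 12.0),
    Player("O_star_guard", 2, 28.0),
    Player("O_forward", 3, 15.0),
]

matchups = assign_matchups(defense, offense)
print(matchups)
```

The manual corrections described above would amount to overwriting individual entries in the returned mapping after the game.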

Similarly, there are those who have questioned your rate of success. Any way to assuage them? (One crazy idea: can we set up some kind of test of your predictive abilities?)
As far as those who question my rate of success, I am not really interested in proving anything to anyone. I'd be happy to take part in your Stat Geek Smackdown next year.

But aside from that, I probably wouldn't be interested in much.

Do you ever talk shop with other basketball stat experts?
I don't really talk shop with other stat experts; to be honest, I don't really know of any. There are a bunch of people who post on the Sonics Central APBR board whose opinion I respect. But I am kind of of the opinion that anyone devoting all this time to the study of basketball and not willing to really profit from it is fooling themselves in some way.

I remember reading a post on the board where a poster stated that he could predict games with 60% accuracy against the Vegas line, but he was not interested in doing so. If this guy could win even at a 56% clip over a significant sample, he could make a few million in a few years, and probably retire in five years. I tend to question the sanity or honesty of a person making such a claim.
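To see why a 56% clip would be so valuable, it helps to work out the break-even win rate. The sketch below assumes the common point-spread pricing of risking $110 to win $100; the interview itself does not state what odds are being bet into, so those numbers are an assumption.

```python
def break_even_rate(risk=110.0, win=100.0):
    """Win rate at which expected profit is zero when risking `risk` to win `win`."""
    return risk / (risk + win)


def expected_profit(win_rate, risk=110.0, win=100.0):
    """Average profit per bet at the given win rate."""
    return win_rate * win - (1.0 - win_rate) * risk


for rate in (0.50, 0.56, 0.60):
    print(f"win rate {rate:.0%}: expected profit per bet ${expected_profit(rate):.2f}")
print(f"break-even win rate: {break_even_rate():.2%}")
```

Under that assumed pricing the break-even rate is about 52.4%, and a 56% bettor expects roughly $7.60 of profit for every $110 risked, which compounds quickly over a large number of bets.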

As I mentioned, there are a few people whose work and opinion I respect. I have even reached out to a few of them with job offers, but I haven't really found anyone who I'd either be interested in hiring, or who'd be interested in working for me. I'd also add that anyone who feels that they can beat the sport from a predictive angle is also welcome to come work for me and make a few million.

I probably have a different outlook than most. One of the things I never mentioned in our first Q and A was that I grew up in a home where gambling was prevalent. I spent most of my life at the horse race track with my father, who also bet quite a bit on sports. When you asked me how one goes about becoming a professional gambler, I failed to mention that my father was betting on sports throughout my childhood. Prior to making my one large bet I had already been betting for about 3-5 years. I started out with an average bet size of around $30.

I have been around sports betting and card playing for quite some time, so I definitely look at things a different way. Most people invest in mutual funds or the stock market, so when I hear someone say that they could make money betting on sports, but choose not to, to me it's like someone saying they could beat the stock market but choose not to.

If you could get an NBA job, what would you say to David Stern to convince him that you won't lapse into your ex-career from time to time?
If I got an NBA job I'd be happy to submit to a yearly polygraph and answer any and all questions regarding whether or not I still gambled.

You use the word lapse as if sports betting is an addiction. I spent four-and-a-half months living at the Bellagio in Las Vegas and only played poker.

Gambling is of no interest to me. I "gamble" on two things: poker and sports, neither of which is actually a gamble, because my edge in both is rather large.