Monday, February 20, 2017

Bayes Area

tl;dr: Bayesian Basketball Dashboard here.

One of the themes of my career to this point, from my doctoral research, to being a data scientist in industry, to writing this very blog, is the interaction between substantive expertise and quantitative analysis. In some disciplines, such as scientific research, these two domains are inextricably linked; a scientist with domain expertise proposes a model or hypothesis for how some phenomenon works, and then uses data to confirm or reject the hypothesis. In other disciplines, there has been a tension between the two. For example, Nate Silver describes a conflict between traditional political pundits and his form of data journalism.

If you know me, it's no surprise that I care deeply about data. I even feel silly writing that sentence. It seems obvious. Ever since I wrote a senior thesis in college where I analyzed auction prices for sulfur dioxide permits, I have loved getting my hands on data and learning from it. But it also seems like an empty sentence; data is everywhere and people use it in countless ways. To say that you care about data in 2017 feels like saying a fish cares about water.

What I think others would find a little surprising is my willingness to overlook or go beyond data and trust human expertise. Just because I have numbers stored somewhere doesn't mean I have evidence. Obviously, data can be misleading or biased in some way. But what I believe is that in the absence of good data, people can be (in the right contexts) very good at integrating various pieces of qualitative and quantitative information and forming judgements.