Sunday, March 5, 2017

Measured Action

In the wake of Trump's election, I have been trying to figure out how to be an effective advocate for the issues I care about and the values I believe in. I vowed to be more civically engaged and to prioritize acting over waiting for assurance of the value of action. Without getting into exactly what I have been doing, I have taken to heart what former Obama speechwriter Jon Favreau recently described as "protest is the new brunch."

All that said, I do want to figure out what actions would be the best use of my time. Like any good policy wonk and data nerd, I turned to empirical research to identify which actions can have an impact. I found* a great paper on this very subject: "Do Political Protests Matter? Evidence from the Tea Party Movement" by Andreas Madestam, Daniel Shoag, Stan Veuger, and David Yanagizawa-Drott, published in the Quarterly Journal of Economics.

In this paper, the authors attempt to identify whether showing up at a protest is going to have any political impact. 

Because protests and political outcomes may be correlated through some latent variable, identifying such a causal impact can be difficult. Regions with large protests are likely to have election results consistent with the political leanings of those protests. I live in Oakland and near Berkeley; I see protests all the time. Liberal stalwart Barbara Lee represents my district. That doesn't mean the protests push her to be liberal. It is pretty likely that my district just has many liberal-minded people living in it. Whatever made us liberal and made us decide to live in this district may be the very thing that causes us to vote for liberal representatives and to show up at protests. Barbara Lee would likely be my representative and would stand up for progressive causes, whether or not there were protests.

So the question is, does my showing up at a protest have any impact on broader policy? This is hard to figure out. It's not like we can just run an experiment where we randomly decide which congressional districts have a protest and which don't, and then observe what happens. The authors of this paper identified something almost as good. They realized different districts would have different weather on protest days, and the weather could affect the size of the protest. Because rain in a specific location on a specific day is relatively random, they could use the variation caused by weather to measure the impact of protests on politics.
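For the data nerds, here is a minimal sketch of why the rain trick works, run on simulated districts rather than the paper's data. Everything in it is an assumption made up for illustration: the function names (`simulate_districts`, `ols_slope`, `wald_iv`), the coefficients, and the hidden "leaning" variable that plays the role of the latent factor described above. It is not the authors' specification, which is considerably more involved.

```python
import numpy as np


def simulate_districts(n, true_effect, seed=0):
    """Simulate districts where a hidden political leaning drives both
    protest attendance and votes. All coefficients are invented."""
    rng = np.random.default_rng(seed)
    leaning = rng.normal(size=n)      # unobserved confounder
    rain = rng.random(n) < 0.3        # rain on protest day, independent of leaning
    attendance = 5.0 + 2.0 * leaning - 1.5 * rain + rng.normal(size=n)
    votes = 10.0 + true_effect * attendance + 4.0 * leaning + rng.normal(size=n)
    return rain, attendance, votes


def ols_slope(x, y):
    """Naive regression slope of y on x: cov(x, y) / var(x)."""
    cov = np.cov(x, y)
    return cov[0, 1] / cov[0, 0]


def wald_iv(instrument, x, y):
    """Instrumental-variable (Wald) estimate with a binary instrument:
    the rain/no-rain gap in votes divided by the rain/no-rain gap in attendance."""
    z = np.asarray(instrument, dtype=bool)
    return (y[z].mean() - y[~z].mean()) / (x[z].mean() - x[~z].mean())


if __name__ == "__main__":
    rain, attendance, votes = simulate_districts(200_000, true_effect=1.0, seed=1)
    print("naive estimate:", ols_slope(attendance, votes))
    print("rain-based estimate:", wald_iv(rain, attendance, votes))
```

In this toy setup the naive estimate is pushed well above the true effect of 1.0, because leaning inflates both attendance and votes. The rain-based estimate should land close to 1.0, because rain shifts attendance without being related to leaning.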

On April 15, 2009, the Tea Party held large, organized Tax Day protests, not unlike the Women's Marches on the Saturday following Inauguration Day. So essentially, the authors were asking, "Did rain on the day of the protest affect who got elected a year and a half later?" They also looked at other outcomes, like how representatives voted in Congress in the years following the protest. The result they got was a pretty strong "Yes." Furthermore, it seems extremely reasonable to believe the only way rain on that specific day would affect outcomes is through the protests.

The authors state:
... we find that the weather-driven exogenous variation in rally attendance on Tax Day 2009 affected the eventual impact of these rallies. Where it did not rain, the number of local Tea Party activists was larger than where it did. Grassroots organizing increased, as did contributions to associated PACs and attendance at subsequent rallies. The population at large adopted the conservative-libertarian views of the protesters, and voter mobilization rose. This then led to more conservative voting both in the 2010 midterm elections and in the U.S. House of Representatives, and encouraged Democrat incumbents to retire.
Specifically, they find that "for every protester, Republican votes increased by seven to fourteen votes." That's pretty powerful.

They don't believe showing up at the protest is sufficient. By measuring attendance at future rallies, donations to conservative groups, and press coverage, they are actually arguing that these protests likely set off a chain of events that eventually led to the political outcomes. If you miss a protest but make a donation, you may have the same impact as someone who donated after showing up at the protest. But a larger protest makes it more likely that you donate in the first place.

While the authors create estimates for the change in the number of votes (and other outcomes) per protester, they admit that this requires a strong assumption: the only causal mechanism between weather and the outcomes is attendance. They are pretty clear that the only way they would expect weather on a specific day to influence outcomes is through the rally, but other aspects of the rally can be affected by the rain. Maybe fewer reporters show up, or maybe people have a better time in good weather, making them more likely to return to the next one. The authors have no strategy to separate the impact of attendance from that of press coverage or of the quality of the experience.

The authors also do some interesting tests to make sure this is not a spurious correlation. The most interesting was looking at variation in weather on other days. They look back nearly 20 years and find all days where at least 10% of districts had rain (representing enough variation to run the experiment). This provided a sample of over 100 days with no particular political significance, which they used to see if rain had an effect on elections: testing a placebo. Unsurprisingly, the effect on April 15 was often near the edge of the distribution, suggesting that it is unlikely that this particular day was significant by random chance. However, it also demonstrated that there were a small number of other days where rain was more impactful, even though there was no theoretical reason to think it would be. But hey, that's statistics.
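The placebo logic boils down to asking where the real protest day falls among the placebo days. Here is a minimal sketch of that comparison, assuming you already have one estimated rain effect per day. The function name `placebo_rank` and the numbers in the example are hypothetical, not values from the paper.

```python
import numpy as np


def placebo_rank(actual_effect, placebo_effects):
    """Share of placebo days whose estimated rain effect is at least as
    large in absolute value as the effect on the real protest day."""
    placebo = np.abs(np.asarray(placebo_effects, dtype=float))
    return float(np.mean(placebo >= abs(actual_effect)))


if __name__ == "__main__":
    # Made-up placebo estimates: pure noise around zero for 100 ordinary days.
    rng = np.random.default_rng(0)
    placebo_effects = rng.normal(loc=0.0, scale=1.0, size=100)
    print(placebo_rank(2.5, placebo_effects))
```

A small share means the protest-day effect sits near the edge of the placebo distribution. A share above zero is the "small number of other days" situation: some ordinary days will look impactful purely by chance.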

Finally, the authors present a theoretical model for the impact of protests. They suggest two mechanisms by which protests can influence politicians: persuasion and information. Persuasion is about applying political pressure to move politicians to change their positions. Information is about letting them know where the electorate stands. The authors conclude that if information were the driving mechanism, the effect of the rallies would decrease over time, whereas the empirical evidence shows the opposite. This suggests that protests can persuade politicians to change their behavior.

Overall, I found the paper both clever and persuasive that protests matter. At the same time, there is no evidence that protests are either sufficient or necessary. If an individual would be politically involved by donating money or joining an organization regardless of whether or not they went to a protest, this paper does not prove that they should have gone. Similarly, going to a protest but not being involved afterwards may not change votes. All this paper shows is that going to a protest is part of a causal path to influencing electoral outcomes.

So I'll keep on doing all the things. And while it's unclear whether the Women's March will have the same result as the beginning of the Tea Party, I look forward to reading the research that lets us know.

*recommended through the Linear Digressions podcast

Monday, February 20, 2017

Bayes Area

tl;dr: Bayesian Basketball Dashboard here.

One of the themes of my career to this point, from my doctoral research to being a data scientist in industry to writing this very blog, is the interaction between substantive expertise and quantitative analysis. In some disciplines, such as scientific research, these two domains are inextricably linked; a scientist with domain expertise proposes a model or hypothesis for how some phenomenon works, and then uses data to confirm or reject the hypothesis. In other disciplines, there has been a tension between the two. For example, Nate Silver describes a conflict between traditional political pundits and his form of data journalism.

If you know me, it's no surprise that I care deeply about data. I even feel silly writing that sentence. It seems obvious. Ever since I wrote a senior thesis in college where I analyzed auction prices for sulfur dioxide permits, I have loved getting my hands on data and learning from it. But it also seems like an empty sentence; data is everywhere and people use it in countless ways. To say that you care about data in 2017 feels like saying a fish cares about water.

What I think others would find a little surprising is my willingness to overlook or go beyond data and trust human expertise. Just because I have numbers stored somewhere doesn't mean I have evidence. Obviously, data can be misleading or biased in some way. But what I believe is that in the absence of good data, people can be (in the right contexts) very good at integrating various pieces of qualitative and quantitative information and forming judgements.

Sunday, July 31, 2016

Research Matters

One of my first blog posts was about the fantastic book Between the World and Me, by Ta-Nehisi Coates. At that time, I said I was looking forward to writing about some academic research about racism and the use of force by police officers. As with many things in the blog, it took a while. In this case, it wasn't for lack of trying. Over the last six months, I found myself returning over and over again to Google Scholar, but was unable to find any compelling research in this area.

Then, the exact week that officer-involved shootings became a major news story again, with two high-profile incidents, rallies across the country, and then a shooting of police officers, a relatively high-profile piece of economic research came out. A working paper, An Empirical Analysis of Racial Differences in Police Use of Force by Roland G. Fryer, Jr., was posted on the website of the National Bureau of Economic Research (NBER). The paper examines whether African Americans and other minority groups experience disproportionate amounts of force after being stopped or encountered by the police.

Sunday, July 24, 2016

Hadoop... There It Is (Part 2)

Well, at long last, I have completed my Hadoop Raspberry Cluster. It took a couple of months to dive back into this project. I have my own personal cloud, running technology similar to what powers some of the world's most important tech companies. However, my cloud is pretty lame. It is less powerful than the MacBook Air that I am currently writing this post on. But at least it's complete, and it's time to write about it!

Saturday, May 21, 2016

Analyze That: Data Journalism and Trump

In the last week or so, I have encountered a lot of discussion about the failure of data journalists (mostly the good folks at 538) to predict Trump's nomination to the Republican ticket. In fact, that's understating it a little bit: they were quite confident that Trump would not be nominated - famously, Nate Silver put his chances of winning around 2 percent. In a recent podcast and 538 article, Nate Silver did an interesting post-mortem on the analysis. In part, he critiques his own methods, and in part he chastises himself for issuing a subjective prediction that did not come from a computational model. He states that in this particular instance, he acted like a pundit. He was too focused on his own priors and underestimated the uncertainty due to a small sample size of "Trump-like" candidates. At the same time, he does defend his use of empirical approaches.

Sunday, May 15, 2016

Analyze That

One of the things I often enjoy doing with my friends is thinking through some political, policy, economic, or business problem. Sometimes this is an issue in the news; sometimes it's something that one of us recently read about or heard about on a podcast. Other times, it's some random topic that we happened to stumble onto over the course of a conversation. Either way, we generally just have a good time breaking such a problem down. We often jokingly refer to this as "consulting the shit" out of a problem.