Thursday, November 03, 2011

One very long weekend in New York City for Megan Price

Guest Beneblog by Megan Price

New York City has many attractions – people often visit Times Square, the Statue of Liberty, Central Park, among many other sights. Me? I go to New York City to spend the weekend staring at my computer screen.

Data Without Borders’ kickoff Data Dive is what tempted me across the country, and after a much longer than expected day of travel I found myself surrounded by fellow nerds (data scientists, as this particular group prefers to be called). The group included statisticians, epidemiologists, computer scientists, engineers, political scientists, journalists, and ‘data wranglers.’ We were all there thanks to the efforts of Drew Conway, Jake Porway, and Craig Barowsky (Data Without Borders’ founders), who had the crazy idea of bringing together well-intentioned data analysts and non-profits with data in need of analysis.

This particular weekend we divided into teams and tackled projects from the New York chapter of the American Civil Liberties Union (NYCLU), MiX Market, and UN Global Pulse. I joined the NYCLU team, where we worked on data collected by the New York Police Department about their “stop and frisk” practices. “Stop and frisk” is the common name for police stopping a pedestrian. Not all such stops actually result in a search or arrest.

Sara LaPlante, NYCLU’s data and policy analyst, laid out two clear goals for us. The first was to provide data visualizations to help average New Yorkers contextualize their own personal experiences by answering questions such as: In which precinct do the most pedestrian stops occur? On what days and at what times of day do the most pedestrian stops occur? The second goal was a much tougher question: is there a racial bias in pedestrian stops? Researchers have been tackling this question, specifically in NYC, for well over a decade. Most recently, Andrew Gelman, Jeffrey Fagan, and Alex Kiss published a rather complex analysis of similar data in the Journal of the American Statistical Association.

We were not able to make much progress on this second goal in a mere 48 hours. But we were able to provide NYCLU with some useful data visualizations, a dataset ready for analysis (typos corrected, locations translated to latitude and longitude for mapping, etc.), and some good ideas for next steps. The most challenging next step is acquiring disaggregated crime statistics for NYC, which we were surprised and frustrated to find are not readily available online.

Several of us plan to remain involved in NYCLU’s proposed analyses and look forward to staying in contact with Sara and the other members of the team. Descriptions of our project, plus the projects with MiX and Global Pulse, can be found on the Data Without Borders wiki.

The next Data Dive will be held here in San Francisco on Nov. 4-6 (tomorrow!), and I’ll switch from my statistician hat to my non-profit hat for that one – the human rights team is looking forward to supplying some of our own data and recruiting analytical assistance.
