Monday, March 28, 2011

Issues with Crowdsourced Data Part 2

A recent guest Beneblog explains why we believe a correlation researchers found between SMS text messages and building damage was not useful. Some of the questions we received made us realize we need to be clearer about why this is important. Why did we bother analyzing this claim? Why does it matter? Thanks to Patrick Ball, Jeff Klingner and Kristian Lum for contributing this material (and making it much clearer).

We’re reacting to the following claim: “Data collected using unbounded crowdsourcing (non-representative sampling) largely in the form of SMS from the disaster affected population in Port-au-Prince can predict, with surprisingly high accuracy and statistical significance, the location and extent of structural damage post-earthquake.”

While this claim is technically correct, it misses the point. If decision makers simply had a map, they could have made better decisions more quickly, more accurately, and with less complication than if they had tried to use crowdsourcing. Our concern is that if in the future decision makers depend on crowdsourcing, bad decisions are likely to result -- decisions that impact lives. So, we’re speaking up.

In the comments to our last post, Jon from Ushahidi said "If a tool's fitness cannot be absolute, then neither can its fallibility." He also argued that the correlation they found was useful. Why is this something worth arguing about?

Misunderstanding relationships in data is a problem because it can lead decision makers to rely on less accurate, more expensive data instead of obvious, more accurate starting points. The correlation found in Haiti is an example of a "confounding factor". A correlation was found between building damage and SMS streams, but only because both were correlated with the simple existence of buildings. Thus the correlation between the SMS feed and the building damage is an artifact or spurious correlation. Here are two other examples of confounding effects.

- Children's reading skill is strongly correlated with their shoe size -- because older kids have bigger feet and tend to read better. You wouldn't measure all the shoes in a classroom to evaluate the kids' reading ability.

- Locations with high rates of drowning deaths are correlated with locations with high rates of ice cream sales because people tend to eat ice cream and swim when they're at leisure in hot places with water, like swimming pools and seasides. If we care about preventing drowning deaths, we don't set up a system to monitor ice cream vendors.
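The confounding pattern in these examples can be made concrete with a minimal simulation. This is entirely synthetic data, not the JRC or Ushahidi figures: here, damage counts and SMS counts are each driven only by how many buildings a zone contains, yet the two appear strongly correlated until the building count is controlled for.

```python
# Synthetic illustration of a confounder: both "damage" and "sms" depend
# only on building density, yet they correlate strongly with each other.
import numpy as np

rng = np.random.default_rng(0)
n = 2000

buildings = rng.integers(0, 100, n)      # buildings per zone (the confounder)
damage = rng.binomial(buildings, 0.3)    # damaged buildings, given density
sms = rng.binomial(buildings, 0.2)       # SMS volume, given density

def corr(x, y):
    return np.corrcoef(x, y)[0, 1]

def partial_corr(x, y, z):
    # correlate the residuals after regressing x and y on the confounder z
    rx = x - np.polyval(np.polyfit(z, x, 1), z)
    ry = y - np.polyval(np.polyfit(z, y, 1), z)
    return corr(rx, ry)

print(corr(sms, damage))                     # strong spurious correlation
print(partial_corr(sms, damage, buildings))  # near zero once controlled
```

The raw correlation between SMS volume and damage comes out high, but the partial correlation, after removing what building density explains, is essentially zero -- exactly the shoe-size-and-reading pattern.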

We're particularly concerned because we think that using an SMS stream to measure a pattern is probably at its best in a disaster situation. When there's a catastrophe, people often pull together and help each other. If an SMS stream were ever going to work as a pattern measure, it was going to be in a context like this -- and it didn't work very well. We don't think that SMS was a very good measure of building damage, relative to the obvious alternative of using a map of building locations.

The problems will be much worse if SMS streams are used to try to measure public violence. In these contexts, the perpetrators will be actively trying to suppress reporting, and so the SMS streams will not just measure where the cell phones are, they'll measure where the cell phones that perpetrators can't suppress are. We'll have many more "false negative" zones where there seems to be no violence, but there's simply no SMS traffic. And we'll have dense, highly duplicated reports of visible events where there are many observers and little attempt to suppress texting.

In the measurement of building damage in Port-au-Prince, there were several zones where there was lots of damage but few or no SMS messages ("false negatives"). This occurred when no one was trying to stop people from texting. The data will be far more misleading when the phenomenon being measured is violence.

As we've said in each post, crowdsourcing in general and SMS traffic in particular are great for documenting specific requests for help. Our critique is that they are not a good way to generate a valid basis for understanding patterns.

Thursday, March 17, 2011

Crowdsourced data is not a substitute for real statistics

Guest Beneblog by Patrick Ball, Jeff Klingner, and Kristian Lum

After the earthquake in Haiti, Ushahidi organized a centralized text messaging system to allow people to inform others about people trapped under damaged buildings and other humanitarian crises. This system was extremely effective at communicating specific needs in a timely way that required very little additional infrastructure. We think that this is important and valuable. However, we worry that crowdsourced data are not a good data source for doing statistics or finding patterns.

An analysis team from the European Commission's Joint Research Centre analyzed the text messages gathered through Ushahidi together with data on damaged buildings collected by the World Bank and the UN from satellite images. They then used spatial statistical techniques to show that the pattern of aggregated text messages predicted where the damaged buildings were concentrated.

Ushahidi member Patrick Meier interpreted the JRC results as suggesting that "unbounded crowdsourcing (non-representative sampling) largely in the form of SMS from the disaster affected population in Port-au-Prince can predict, with surprisingly high accuracy and statistical significance, the location and extent of structural damage post-earthquake."

One problem with this conclusion is that there are important areas of building damage where very few text messages were recorded, such as the neighborhood of Saint Antoine, east of the National Palace. But even the overall statistical correlation of text messages and building damage is not useful, because the text messages are really just reflecting the underlying building density.

Benetech statistical consultant Dr. Kristian Lum has analyzed data from the same sources that the JRC team used. She found that after controlling for the prior existence of buildings in a particular location, the text message stream adds little to no useful information to the prediction of patterns of damaged building locations. This is not surprising, as most of the text messages in this data set were requests for food, water, or medical help, rather than reports of damage.

In fact, once you control for the presence of any buildings (damaged or undamaged), the text message stream seems to have a weak negative correlation with the presence of damaged buildings. That is, the presence of text messages suggests there are fewer (not more) damaged buildings in a particular area. It may be that people move away from damaged buildings (perhaps to places where humanitarian assistance is being given) before texting.
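Dr. Lum's finding can be sketched with a toy regression. Again, this uses synthetic data standing in for the real damage and SMS counts, under the assumption that both are driven by building density: once building counts are in the model, adding the SMS stream leaves the fit essentially unchanged.

```python
# Synthetic sketch: compare an OLS fit with and without the SMS predictor,
# once building counts are already in the model.
import numpy as np

rng = np.random.default_rng(1)
n = 2000
buildings = rng.integers(0, 100, n)
damage = rng.binomial(buildings, 0.3).astype(float)  # driven by buildings only
sms = rng.binomial(buildings, 0.2).astype(float)     # likewise

def r_squared(predictors, y):
    # ordinary least squares with an intercept column
    X = np.column_stack([np.ones(len(y))] + predictors)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - resid.var() / y.var()

print(r_squared([buildings], damage))       # buildings alone: high R^2
print(r_squared([buildings, sms], damage))  # adding SMS: ~same R^2
```

In this setup the second R^2 barely moves -- the SMS column adds little to no information beyond what building counts already provide, which mirrors the pattern Dr. Lum found in the Haiti data.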

Here's the bottom line: based on the Haiti data presented, if you have a map of buildings from before the earthquake, you already know more about the likely location of damaged buildings than if you relied on an SMS stream. That is, to find the most damaged buildings, you should go to where there are the most buildings! The text message stream doesn't help the decision process. Indeed, it would seem to be slightly more likely to lead you to areas that have fewer damaged buildings. Crowdsourcing has many valuable uses in a crisis, but identifying spatial patterns of damaged buildings isn't one of them.

Sunday, March 13, 2011

Exciting open access project in Vancouver!

I just spent a couple of days in Vancouver with the team at the Public Knowledge Project, a terrific example of an open source social enterprise. Their largest project is Open Journal Systems, software for running a scholarly journal. It takes an editor through the entire process of operating and publishing a journal, with a heavy emphasis on open access journals (where the articles are freely available to everybody from the moment they are published).

Amazingly enough, more than 8,500 journals are published with OJS, with institutions mainly running their own servers. An exciting development is the recent offering of hosting services (through the help of PKP's main partner university, Simon Fraser University in Vancouver), so that a new journal can be launched without even needing its own home server. Many of OJS's users are in the developing world: the tools really put the power of expanding knowledge in the hands of scholars!

One metric that made a real impression: OJS users publish an open access article for around $200 per article, compared with the $3,000 figure frequently cited in the open access field.

I had the privilege of getting in-depth demonstrations of OJS and some of the other open source software built by PKP. It was great to see another social enterprise successfully meeting the needs of a community!