Data Journalism Readings: Wk 4

Six Provocations for Big Data

Although danah boyd and Kate Crawford have tried to define what exactly ‘Big Data’ is in their paper, it still remains a nebulous concept to me. How is ‘Big Data’ different from regular data? Lev Manovich correctly observes that Big Data used to mean data sets so large and complex that they required supercomputers to process. I think a better term for the kind of data we are talking about here would be ‘Accessible Data’, because that’s really what the difference is. The data is out there, ready to be accessed through the large variety of technological tools we have at our disposal. We have more access to all kinds of data than we’ve ever had before, which is why it’s easier for us to challenge how it was collected and how it was interpreted.
The paper gives us six points to ponder now that we have all this access at our disposal. Boyd and Crawford argue that the nature of accessing knowledge itself is changing now that we live in a world where all knowledge is treated as quantifiable. The authors make it a point to argue that we should never expect mounds and mounds of data to prove our hypotheses for us, and that it should never be a case of just letting the numbers speak for themselves. It is important to note that not all data is readily quantifiable, and we should be flexible enough to keep adapting to it. A similar point is made under ‘Claims to Objectivity and Accuracy are Misleading’. Numbers on their own mean nothing, or rather can mean anything. It is up to us to inspect not just the data, but the methodology and the interpretation, while also making our own efforts transparent and able to withstand questioning.

To me the most interesting point raised was ‘Just Because it is Accessible Doesn’t Make it Ethical.’ In an era where everyone is fighting for freedom of information and to keep the internet as one big giant motherly database, it is important to keep some perspective. We’ve been told for so long that more information is good that we’ve been willingly giving up our privacy and allowing ourselves to unknowingly be part of hundreds of surveys every day. (Here’s an article in the New York Times that details exactly how much information Facebook has on its users.) One could argue that the results of these data-collecting efforts are for our own benefit: personalized advertisements, smart Google searches, websites that retain information about our tastes and habits so they can recommend exactly what we want. But is that worth giving up our privacy? Also, is it ethical for companies to use this information even if we don’t have a problem with how they got it? These are important questions, and we must keep sight of them as the internet continues to remain much like the Wild West. It has existed for too long without any real regulation, and it’s up to everyone to police themselves.


Opening the Political Mind? The effects of self-affirmation and graphical information on factual misperceptions.

This entire paper is an exercise in showing us how to use data to test a hypothesis. In this case, the authors believe (on the basis of psychological research) that people are unlikely to change their beliefs even after receiving evidence that they might be wrong. The reason for this is that it is not just their beliefs that are challenged but their self-worth. The researchers also hypothesize that people are more likely to believe graphical representations of information than the same information delivered in text. They set out to test these hypotheses using three different experiments with different sets of people (that mirror the general makeup of the population).

In the first test, the researchers only test the effect of boosting a person’s feelings of self-worth on their readiness to accept a view that is contrary to theirs. For this test, they used the question of whether the surge in troops in the Iraq War had been successful, a hotly debated topic at the time, with Democrats and Republicans coming down on opposite sides of the debate.

The second experiment has them testing the effect of viewing ‘counter-attitudinal’ information in graphical form versus textual form. In this case, the question at issue is President Obama’s performance on the nation’s economy.

The last test observes the combined effect of self-worth and graphical representation on making people more open to changing their views. For this test, the researchers decided to go with a more homogeneous group, with all the participants identifying themselves as Republican or Republican-leaning. This was done to better illustrate the results of the experiment, where information on the credibility of global warming was presented in varying forms to a group that has historically, and almost uniformly, dismissed it.

The researchers found that people whose feelings of self-worth had been boosted were indeed more likely to accept the information given to them, and they published their results as percentage changes. The tests were also able to show that people were more open to graphical representations than to text. While the first hypothesis seems obvious, I thought it was interesting to note that the researchers did not have a sound theory to explain why people are more trusting of charts than of words based on the same data. They had a lot of theoretical support to show us why people resist contrary opinions, but little to suggest why their attitudes can be changed by simply rearranging the visual representation.


  SEM says:

    Nice article about the value/implications of sharing data (online or off). You may also be interested in Cory Doctorow’s discussion on the topic, here:

