Edward Tufte: Data Analysis for Politics & Policy, Chapter 1
The reading is an introduction to data analysis, and it begins with several caveats about using data to draw conclusions. Tufte draws an important distinction between using data to prove something and using it to shed more light on an issue. While data is undoubtedly used for the former, it is often more effective for the latter.
Once that has been established, Tufte takes us through the building blocks of data analysis: we need to start with a question and have some sort of theory or hypothesis from which to begin. He then explains the different types of variables and how important it is to distinguish one from the other for the analysis to be credible.
Along the way, Tufte identifies several common mistakes made when drawing conclusions from data. These include inferring causation when there is in fact merely an association, and lacking a sufficiently controlled comparison.
To illustrate the methodology, Tufte uses data to analyze the relationship between automobile inspections and the number of fatal car accidents. We start by hypothesizing that states with mandatory, thorough automobile inspections will have a lower fatality rate than those without. When we tally up the data, we can see that this is indeed the case.
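The tallying step can be sketched in a few lines of Python. This is only a rough illustration of comparing two groups of states: the state labels and death rates below are invented for the example and are not Tufte's figures, and the helper name mean_rate is my own.

```python
from statistics import mean

# Hypothetical records: (state label, has_inspection, deaths per 100,000 people).
# These numbers are invented for illustration and are NOT Tufte's data.
states = [
    ("A", True, 22.0),
    ("B", True, 25.0),
    ("C", True, 28.0),
    ("D", False, 30.0),
    ("E", False, 27.0),
    ("F", False, 36.0),
]


def mean_rate(records, inspected):
    """Average death rate over states whose inspection status matches."""
    return mean(rate for _, has_inspection, rate in records
                if has_inspection == inspected)


with_inspection = mean_rate(states, True)
without_inspection = mean_rate(states, False)
difference = without_inspection - with_inspection

print(f"With inspections:    {with_inspection:.1f}")
print(f"Without inspections: {without_inspection:.1f}")
print(f"Difference:          {difference:.1f}")
```

A lower average in the inspected group is consistent with the hypothesis, but on its own it only shows an association between inspections and fatality rates, not that inspections cause the difference.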
However, the analysis doesn’t end there. The author’s point is that once a base has been established and the model holds up under rigorous testing, we can go back and change the variables, come up with new hypotheses, or use different conditions to get the most out of the data.
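One way to picture "using different conditions" is to repeat the same comparison inside subgroups defined by a third variable, which is a simple form of controlled comparison. The sketch below is an assumption-laden illustration: the grouping variable (a "low" or "high" density label), the death rates, and the helper name gap are all hypothetical and are not taken from the chapter.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (density group, has_inspection, deaths per 100,000).
# The grouping variable and every number here are invented for illustration.
records = [
    ("low", True, 34.0), ("low", False, 36.0), ("low", False, 38.0),
    ("high", True, 20.0), ("high", True, 22.0), ("high", False, 23.0),
]


def gap(rows):
    """Mean rate without inspections minus mean rate with inspections."""
    with_insp = mean(rate for _, insp, rate in rows if insp)
    without_insp = mean(rate for _, insp, rate in rows if not insp)
    return without_insp - with_insp


# Raw comparison that ignores the third variable.
overall_gap = gap(records)

# Controlled comparison: the same gap, computed within each group.
by_group = defaultdict(list)
for row in records:
    by_group[row[0]].append(row)
gaps_within = {group: gap(rows) for group, rows in by_group.items()}

print(f"Overall gap: {overall_gap:.1f}")
for group, value in gaps_within.items():
    print(f"Gap within {group}-density states: {value:.1f}")
```

With these made-up numbers the gap is smaller inside each group than in the raw comparison, which is the kind of result that should make us cautious about reading the overall difference as the effect of inspections alone.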