Data Inquiries – Week 3

The reading you were asked to do during the first week, Chapter 2 from Tufte’s “Visual Explanations”, concluded with a list of general principles for reasoning about statistical evidence. During the first two weeks, we have focused on the first of these principles: documenting the sources and characteristics of the data. We have explored how to use tools such as GitHub and R Markdown; we have emphasized the importance of writing as a means to “record what might otherwise be forgotten”; and we have spent some time digging into the GitHub pages to understand the meaning of all the variables we were presented with. This week we want to focus on the second principle: insistently enforcing appropriate comparisons. The primary medium we are going to explore with this goal is graphical displays.

  1. Get familiar with how our mind processes visual information by reading pages 39-49 from Per Mollerup (2015) Data Design. [Required]

  2. Learn about the concept of “data-ink” by reading pages 91-95 from Tufte (1983) The Visual Display of Quantitative Information. Discover how data-ink can be maximized using “small multiples” by reading pages 161-162 and 167-172 of the same book (we have scanned the entire chapter for continuity). [You can decide to get back to this later]

  3. Watch the video How humans see data, a lecture by John Rauser. [You can decide to get back to this later]

  4. Read The Fundamental Principles of Analytical Design from Tufte (2006) Beautiful Evidence. [Required]

  5. Read the EpidemicComparisons document, which contains a number of references on ggplot2. Create two graphical displays that illustrate characteristics of the COVID-19 epidemic in Italy and the USA. See more precise directions in that document. Be ready to share the displays and the code you use to create them. [Required]

You will notice that, between the readings here and the numerous links in the EpidemicComparisons document, which introduces ggplot2, there is a lot of material thrown your way in this pre-work. It might feel a bit overwhelming to go through all of this on top of your project. So, here are two things to keep in mind.

First, we do not expect you to learn all the content of these materials during this week. We are using this as a place where we can concentrate our suggestions related to graphic design, plot construction, and the software tools to implement these. We are also trying to provide suggestions so that each one of you, with your diverse levels of familiarity with all these tools, might find something challenging. For this Wednesday, concentrate on the required readings and on creating the two displays, so that we can have an effective discussion. Feel free to come back to the other materials during the rest of the summer.

Second, this work is not supposed to be on top of your project. While we are using the COVID-19 data in the discussion to have a common starting point, what you are learning about insistently enforcing appropriate comparisons and about graphical displays is supposed to enhance your projects. For example, next week you will give your first presentation. It is quite likely that you will have some sort of graphical display in it: see if what you are reading this week can help you make that display more effective. Remember the importance of comparison and make sure that you provide plenty of comparisons.