Talk at Harvey Mudd College
mdogucu.github.io/harvey-mudd-25
2025-03-29
Q1. What is data representation?
Q2. How does one make good data visualizations (and avoid making bad or ugly ones)?
North Circumpolar Region from the Dunhuang Star Chart circa 649-684 CE.
Recommended reading
Funkhouser, H. G. (1937). Historical Development of the Graphical Representation of Statistical Data. Osiris, 3, 269–404. Chapter 2 is on The Origin of the Graphic Method.
Assessed value of household and kitchen furniture owned by Black people in Georgia.
20th-century navigational chart from Kwajalein Atoll, Marshall Islands, Micronesia, on display at the Bowers Museum in Santa Ana. Photo by Mine Dogucu.
Visualization by Mona Chalabi.
Wanda Díaz-Merced is a Puerto Rican astronomer known for using sonification while studying stars. She is the director of the Arecibo Observatory.
Same-sex marriages in Buenos Aires City by Macarena Zappe
COVID related deaths table by the Economist
Howardena Pindell’s Four Little Girls
LA Metro Rail map
Data representation refers to the way we structure data to make the trends, patterns, and relationships in the raw data easier to understand.
Data visualization is one way of representing data. We have also seen data sonification and data in tactile format (e.g., the sticks).
Tables, maps, charts, plots, and infographics are some ways of visualizing data.
The coordinate system, length, width, area, volume, color, and shape are ways we map data to visual elements.
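As a small illustration of mapping data to a visual element, the sketch below encodes a value as the area of a circle. The function name `radius_for_area`, the `area_per_unit` parameter, and the sample values are hypothetical, chosen only for this example. It shows why the radius has to grow with the square root of the value if the area is meant to carry the data.

```python
import math


def radius_for_area(value, area_per_unit=1.0):
    """Return the radius of a circle whose AREA is proportional to value.

    area_per_unit is a hypothetical scaling constant (drawn area per data unit).
    """
    if value < 0:
        raise ValueError("area encoding needs non-negative values")
    # area = pi * r^2, so r = sqrt(area / pi)
    return math.sqrt(value * area_per_unit / math.pi)


# Hypothetical data: the second value is four times the first.
values = [10, 40]
radii = [radius_for_area(v) for v in values]

# The radius only doubles, while the drawn area is four times larger.
radius_ratio = radii[1] / radii[0]
area_ratio = (math.pi * radii[1] ** 2) / (math.pi * radii[0] ** 2)
print(round(radius_ratio, 3), round(area_ratio, 3))
```

If the value were mapped straight to the radius instead, a fourfold difference in the data would be drawn as a sixteenfold difference in area.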
Different tools (e.g., pen, paper, digital platforms, software, physical objects) can be used to represent data.
Data representations do not necessarily have to be made by a data scientist, but making them well requires an understanding of data science, the domain discipline, and art.
Data representations can be used for both exploratory and explanatory purposes.
Tip
Whenever possible, start the axis at zero.
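A rough arithmetic sketch of why this matters for bar charts, where length encodes the value: when the axis starts above zero, the ratio between drawn bar lengths no longer matches the ratio between the values. The function `drawn_ratio` and the numbers below are hypothetical, made up for illustration.

```python
def drawn_ratio(a, b, baseline=0.0):
    """Ratio of the drawn bar lengths for values a and b
    when the value axis starts at baseline."""
    if baseline >= min(a, b):
        raise ValueError("baseline must be below both values")
    return (a - baseline) / (b - baseline)


# Hypothetical values: 55 is 10% larger than 50.
zero_based = drawn_ratio(55, 50)               # axis starts at 0
truncated = drawn_ratio(55, 50, baseline=45)   # axis starts at 45

print(zero_based, truncated)
```

With the axis starting at zero, the first bar is drawn about 1.1 times as long as the second, matching the data. With the axis starting at 45, it is drawn twice as long.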
Tip
Use lining and tabular fonts for numbers.
Many design decisions go into making a data visualization. The following example is from one of my favorite data visualization experts, Cara Thompson, and is shared under a CC-BY license.
Tip
Choose font size, font type, colors, labels, text alignment, legend, etc. with intention rather than relying on software defaults.
The video shows use of a screen reader briefly.
Chart type
Type of data
Reason for including the chart
Link to data or source (not in alt text but in main text)
Description conveys meaning in the data
Variables included on the axes
Scale described within the description
Type of plot is described
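The items above can be turned into a small helper. This is a minimal sketch of one possible template, not a standard: the function name `build_alt_text`, the sentence pattern, and the example chart are all hypothetical. It combines chart type, type of data, and the reason for including the chart, and leaves the link to the data for the main text.

```python
def build_alt_text(chart_type, data_type, reason):
    """Compose alt text from the chart type, the type of data,
    and the reason for including the chart.

    The link to the data or source is deliberately left out;
    it belongs in the main text, not in the alt text.
    """
    parts = [chart_type.strip(), data_type.strip(), reason.strip()]
    if not all(parts):
        raise ValueError("chart type, data type, and reason are all required")
    return f"{parts[0]} of {parts[1]} where {parts[2]}."


# Hypothetical chart, for illustration only.
alt = build_alt_text(
    "Bar chart",
    "monthly bike rentals",
    "rentals peak in July",
)
print(alt)
```

A fuller description would also name the variables on the axes and describe the scale.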
In the Bayes Rules! book, the authors raise the following questions about model fairness. We can extend these to data visualizations.
How was the data collected? By whom and for what purpose? How might the results of the analysis, or the data collection itself, impact individuals and society? What biases or power structures might be baked into this analysis?
Since you are not a cat, for Datathon think outside the box PLEASE
Slides at mdogucu.github.io/harvey-mudd-25.
Source code for slides at github.com/mdogucu/harvey-mudd-25.
minedogucu.com
mdogucu
mastodon.social/@MineDogucu
bsky.app/profile/minedogucu.com
minedogucu