Ross Ihaka’s course on information visualization

Jouni pointed me to this course on information visualization by Ross Ihaka (one of the original authors of R).

It looks great (and should be helpful for me in preparing my new course in statistical graphics next spring). My only complaint is that it focuses so strongly on techniques without any theoretical discussion of how graphical methods relate to statistical ideas such as model checking and exploratory data analysis. (This is a particular interest of mine.)

I’ll have to look over the notes in detail to see what I can learn. I use pretty sloppy programming techniques to make my graphs (I always have to do a lot of hand-tuning to get them to look just how I want), and I think Ihaka’s more systematic approach could be helpful.

In the meantime, a few picky comments

I started off by reading the lecture on “good and bad graphs.” The bad graphs are pretty bad, and the good ones are much better, but I think the good ones could be better still:

– The improved graph of “age structure of college enrollment” (page 11 of the slideshow) could be further improved by cleaning up the y-axis and just having tick marks at 25, 30, and 35. (Or, in the bar graph on page 12, y-axis labels at 0, 10%, 20%, 30% would be enough.) The graphs could be improved far more by including data since 1976, or even better by having little age histograms for each year, and displaying a series of these (the “small multiples” idea of Tufte, 1990).

Yes, I recognize that these latter graphs would require additional data–but sometimes that’s the point, right?

– The improved graph of “earnings per share and dividends” (page 14) seems context-free. I guess a y-axis would help, along with removal of the little numbers inside the bars, an improved x-axis, and of course additional data.

– The improved graphs of “faculty size” and “proportion of female students” (pages 17 and 18) are confusing, because the faculties appear in a different order in the two graphs. I’d prefer a single graph with three panels from left to right: left panel has the names of the faculties (in decreasing order of faculty size), middle panel has the faculty size (as a dotplot), and right panel has percentage of female students (as a dotplot).

– I don’t see why the graph on “required fuel economy standards” (page 21) needs to go to 0, and I don’t think the bars make sense here. (To me, bars signify mass, with the area of the bar representing “how much,” but that doesn’t make sense with fuel economy.) I think a line plot would be better, also with much more data (going back before 1978 and after 1985). With more data, the x-axis could have tick marks every 10 years, not every year.

– I think the “purchasing power” graph on page 25 would be better off as a time series (a line plot). Also, why bother with the little numbers on top of the bars?

– The “median net income” graph on page 29 is misleading because it is not adjusted for the consumer price index. It should be redone in so-called “constant dollars.” (Yes, these are not perfect, but it’s better than no adjustment at all.) And the x-axis should be labeled every 10 or 20 years. To me, labeling every 5 years is overkill and just makes it harder to follow or find landmarks.
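To make that last point concrete, here is a minimal sketch of the constant-dollar adjustment and the sparser axis labeling. It assumes the usual rescaling (nominal amount times the base-year index divided by that year's index). The function names `to_constant_dollars` and `decade_ticks` are made up for illustration, and the numbers are invented, not real income or CPI figures.

```python
def to_constant_dollars(nominal, cpi, base_year):
    """Rescale nominal amounts into base_year dollars using a price index."""
    base = cpi[base_year]
    return {year: amount * base / cpi[year] for year, amount in nominal.items()}


def decade_ticks(years, step=10):
    """Return the years within the data range that are multiples of `step`."""
    first = -(-min(years) // step) * step  # round the first year up to a multiple
    return list(range(first, max(years) + 1, step))


# Invented numbers for illustration only (NOT real income or CPI data).
nominal_income = {1970: 10000.0, 1980: 20000.0, 1990: 30000.0}
price_index = {1970: 50.0, 1980: 100.0, 1990: 150.0}

real_income = to_constant_dollars(nominal_income, price_index, base_year=1990)
ticks = decade_ticks(range(1968, 1993))

print(real_income)
print(ticks)
```

In this toy example the nominal series triples while the adjusted series is flat, which is exactly the sort of thing an unadjusted graph can hide.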

Summary

I don’t mean to be negative. This looks like a great course. I just thought it would be fun to be picky and try to see if there are details where Ihaka and I might disagree. Where we do disagree on graphics, I’m probably wrong.
