Why don’t more people use interactive graphics?

In a comment to this entry, Antony Unwin writes,

It is great to see someone emphasising the use of graphics for exploration as well as for presentation, but if you want to do EDA you need interactive graphics. It’s a mystery to me why more people do not use them. Perhaps it’s because you have to have fast and flexible software, perhaps it’s because of the subjective component in exploratory work, perhaps it’s because using interactive graphics is fun as well as productive and statisticians are serious people. I’d be grateful for explanations.

I’m in a good position to answer this question, since I probably should be using interactive graphics but I actually don’t. Why not? I don’t know how to, and I guess I never was forced to learn. But I’m teaching a seminar course on statistical graphics this semester, so this would be a great time for me (and my students) to learn.

Resources for getting started with dynamic graphics? (preferably in R)

The class will be structured as a student presentation each week. So, one week I’d like a pair of students to present on dynamic graphics. During the week before, they’d learn a dynamic graphics system, then during the class they’d do a demo, then there’d be some homework for all the class for next week.

Any suggestions on what they should do? Ideally it should be in R (since that’s what we’re using for the course). Antony Unwin suggested iPlots in R and Martin Theus’s package Mondrian. The tricky thing is to get a student ready to do a presentation on this, given that I can’t already do it myself. The student would need a good example, not just a link to the package.

9 thoughts on “Why don’t more people use interactive graphics?”

  1. For more controlled interactive graphics I'd suggest GGobi which also has an R interface via (surprise) RGGobi. GeneGobi is a pretty sophisticated example of its use. This is more what I think of when I think of interactive statistical graphics—brushing, plotting and so on. JGR is really more like a front-end. Mondrian is also pretty cool, but doesn't interface with R so your students would have to learn a second system.

  2. Well, there's no way to allow a link to iPlots and Mondrian go unchallenged, so don't forget about ggobi and rggobi. The current version is rather unstable on Windows, but we are in the process of releasing a new beta version that is rock-solid, has a streamlined interface with R (rggobi2) and a much tidier GUI. Look for an announcement on the site in the next week. Ggobi also runs on Linux and OS X, and it is relatively easy to compile a development version on these platforms if you want to experience the bleeding edge.

    There are also a number of demos showing ggobi in action (as well as a number of multivariate teaching examples), and finally the ggobi book describes general dynamic and interactive graphical techniques (it uses ggobi, but the principles can be applied to any graphics package).

    I also have an (unreleased) R package, explore, that uses Rggobi to explore classification boundaries in high dimensions. If you are interested I can email a copy.

  3. Sandy Weisberg has at least a couple of accessible articles on the virtues of interactive graphics for EDA. They include examples about transformations, smoothers, animated three-dimensional graphs, the influence of outliers, etc. One is in the Summer 2005 issue of Chance; the other is in the February 2005 issue of the Journal of Statistical Software (PDF). Both are tied to Arc and XLISP-STAT, but it might now be possible to use R to do some of the things that Weisberg writes about.

    The JSS article is sad: it's part of a special issue that eulogizes XLISP-STAT. Some of the articles — e.g., Weisberg's, Jan de Leeuw's — are also sharp criticisms of R and S.

  4. 1) Interactive graphics are more than dynamic graphics. It is not the movement that is important, it's the interaction.
    2) The Gobi software family has many nice features but does not yet handle categorical data properly. Its particular strength is, of course, projection pursuit.
    3) Mondrian does interface with R in one direction. It uses R as a calculating engine in the background to calculate smooths, densities and other functions via Simon Urbanek's Rserve (stats.math.uni-augsburg.de/Rserve/). This is a great way to link graphics and statistical tools together. What statistics would you like available interactively with your graphic displays?
    4) I have sympathy with the students. There is not a lot of published material and you need to see interactive graphics in action to get an idea of what can be done. In one of his articles George Box writes of how academics teaching statistics can be like instructors teaching swimming without letting the students in the pool. He could have added that it's helpful to see someone swimming first. Why don't you invite Simon Urbanek (now at AT&T) over to give a talk and a demo, Andrew?

  5. Categorical graphics are not GGobi's strength, but they are improving, although we have yet to persuade Heike to implement mosaic plots. It is equally fair to say that Mondrian does not do brushing properly. But you really do need to use a few different interactive/dynamic graphics packages to understand the variety of tools and approaches that are available.

    We are working to make interactive graphics easier to learn by creating videos of GGobi in action, but it is a rather slow process. Any ideas for particular topics would be most welcome.

    You could also invite Debby Swayne (AT&T as well) over to give a talk and a demo about GGobi.

  6. The most important issue in this discussion, to my way of thinking, is how we can teach interactive graphics.  There are three issues.

    1) In mathematics and mathematical statistics there is a formal language for communication. Interactive graphics supports data analysis and needs strategies and procedures as well as concepts and theory. This is certainly harder to teach and a teacher needs practical experience to do it well.
    2) Tools do matter. We certainly need software which implements the concepts of selection and linking and direct manipulation. In principle it does not matter how these concepts are implemented technically, as long as they are accessible and the main purpose of the software – data analysis – is the motivation to write it.
    3) Interfaces are far more important than most of us like to admit. Clunky interfaces are too often the standard nowadays. Interactive graphics software is mostly "just" interface. Finding the right controls and metaphors is very difficult. Talking to users and "wrestling" for the easiest, most intuitive and consistent solution is the only way forward.

    What should Andrew do in his course?

    1) look at lots of tools, and actually use them for his own problems;
    2) avoid packaged examples and use real live examples: not iris, crabs and titanic over and over again;
    3) invite/talk to the experts and look over their shoulders;
    4) complain if functionality is hard to access, inconsistent or missing – the programmer/designer will eventually give in.

    Teaching is the best way to spread the word, so we must take on the burden!

  7. The reason that more people do not use interactive stats software is that it is hell on earth to get it installed and working properly and figure out its quirks. It was straightforward to install R 2.2.1 but it took me a further two days of struggle and web searching to finally figure out that I needed to use the notation SS[,"colname"] to refer to a column in my CSV datafile. Why the extra comma? I don't know. I suspect the language was designed for people who want to write complex statistical applications rather than for people who want to use the tools to analyze data and present results.

    Next up was Ggobi and Rggobi. Rggobi doesn't work at all for whatever reason. Ggobi starts up, but when I try to load a small data file it crashes without a whimper.

    Finally, I tried Mondrian. This was the best of the lot because after I trimmed down my data to only 7 independent variables, it actually did display charts and I was able to brush the data and do some analysis. However, most of the functionality will not work because it depends on a 3rd-party package called Rserve which is very sensitive to version changes in R. You guessed it, Rserve is one release version behind R.

    *sigh*

    Is it any wonder we stick with tools like Excel or GNUPLOT?
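
    An editorial aside on the SS[,"colname"] notation questioned above: as far as base R goes, a data frame is indexed as [rows, columns], and leaving the slot before the comma empty means "all rows". The sketch below illustrates this with a small made-up data frame standing in for the CSV file; the column names height and weight are hypothetical and not taken from the comment.

    ```r
    # Hypothetical stand-in for the commenter's CSV data (names are made up).
    SS <- data.frame(height = c(150, 160, 170),
                     weight = c(50, 60, 70))

    # Indexing is [rows, columns]; an empty row slot selects every row.
    col_a <- SS[, "height"]   # all rows of the column "height", as a vector

    # Two equivalent ways to pull out the same column without the comma.
    col_b <- SS$height
    col_c <- SS[["height"]]

    # Filling the row slot instead selects rows: here, the second row only.
    row_2 <- SS[2, ]
    ```

    Data read with read.csv() is a data frame as well, so the same three forms should apply to a column of an imported file.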

  8. I do a lot of EDA, but find myself not using interactive graphics very much. (What I have available that meets my definition of "interactive graphics" is "brush()" in S-PLUS. While that is old, it's sort of neat, but nonetheless I don't find myself using it very often at all.) What I do do a lot is custom histogram/quantile/scatter plots (& sometimes trellis plots), often involving transforms. I'll sometimes do hundreds of plots a day, looking for things that stand out. So this is almost like interactive graphics. Maybe this is the answer to "why more people do not use them?" — they do, but they're not using "interactive graphics tools". Instead they're using ordinary static plots of data manipulated with powerful expressive tools like S-PLUS and R to accomplish the same (or even more)?
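
    The "many static plots" workflow described above can be sketched roughly as follows. This is only an illustration under assumptions, not the commenter's actual code: it uses R's built-in mtcars data as a stand-in, picks wt as an arbitrary reference variable for the scatter plots, and writes everything to a temporary PDF (out_file is a made-up name).

    ```r
    # Stand-in data; any data frame with numeric columns would do.
    dat <- mtcars
    num_cols <- names(dat)[sapply(dat, is.numeric)]

    # Collect all plots in one PDF so they can be paged through quickly.
    out_file <- file.path(tempdir(), "eda_plots.pdf")
    pdf(out_file)
    par(mfrow = c(2, 2))  # four panels per page, one page per variable

    for (v in num_cols) {
      x <- dat[[v]]
      hist(x, main = paste("Histogram of", v), xlab = v)
      qqnorm(x, main = paste("Normal Q-Q plot of", v))
      qqline(x)
      # A simple transform; log1p is used because some columns contain zeros.
      hist(log1p(x), main = paste("Histogram of log(1 +", v, ")"), xlab = v)
      plot(dat$wt, x, xlab = "wt", ylab = v, main = paste(v, "vs wt"))
    }

    dev.off()
    ```

    Paging through the resulting file gives one page per variable, which is close in spirit to scanning many plots for things that stand out.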

  9. We have just released a new beta version of GGobi for Windows. This version is MUCH more stable (and much easier to use) than the previous version. If you have been burnt in the past, please do try again as I think you will be pleasantly surprised. (Sorry to hear about your bad experience, Michael.)

    If you do have problems the mailing list is very friendly and we'll do our best to get your problems solved. We have also recently set up a bug tracker and we are getting better at fixing bugs in a timely fashion.

    Tony, I too use a large number of plots in R (especially using Trellis) to explore my data. However, there are some things that are hard to do with static graphics (e.g., what IS that weird point?) that are trivial with interactive graphics. The GGobi team is very keen to get more feedback about what other features people want, so if you try out GGobi and think that there is something important missing, do let us know.
