Influential statisticians

Seth lists the statisticians who’ve had the biggest effect on how he analyzes data:

1. John Tukey. From Exploratory Data Analysis I [Seth] learned to plot my data and to transform it. A Berkeley statistics professor once told me this book wasn’t important!

2. John Chambers. Main person behind S. I [Seth] use R (open-source S) all the time.

3. Ross Ihaka and Robert Gentleman. Originators of R. R is much better than S: Fewer bugs, more commands, better price.

4. William Cleveland. Inventor of loess (local regression). I [Seth] use loess all the time to summarize scatterplots.

5. Ronald Fisher. I [Seth] do ANOVAs.

6. William Gosset. I [Seth] do t tests.

My data analysis is 90% graphs, 10% numerical summaries (e.g., means) and statistical tests (e.g., ANOVA). Whereas most statistics texts are about 1% graphs, 99% numerical summaries and statistical tests.

I think this list is pretty reasonable, but I have a few comments:

1. Just to let youall know, I wasn’t the Berkeley prof who told Seth that EDA wasn’t important. I’ve even published an article about EDA. That said, Tukey’s book isn’t perfect. I mean, really, who cares about the January temperature in Yuma?

2, 3. I agree that S and R are hugely important. But if they weren’t invented, maybe we’d just be using APL or Matlab?

4. Cleveland also made important contributions to statistical graphics.

5. I’ve written an article about Anova too, but at this point I think of Fisher’s version of Anova as an excellent lead-in to hierarchical models and not such a great tool in itself. I think that psychology researchers will be better off when they forget about sums of squares, mean squares, and F tests, and instead focus on coefficients, variance components, and scale parameters.

6. I don’t really do t-tests.

P.S. I wouldn’t even try to make my own list. As a statistician myself, I’ve been influenced by so many many statisticians that any such list would run to the hundreds of names. I suppose if I had to make such a list about which statisticians have had the biggest effect on how I analyze data, it might go something like:

1. Rubin: He taught me applied statistics and clearly has had the largest influence on me (and, maybe, on many readers of my books)

2. Laplace/Lindley/etc.: The various pioneers of hierarchical modeling and applied Bayesian statistics

3. Gauss: Least squares, error models, etc etc

4. Cleveland: Crisp, clean graphics for data analysis. Although maybe if Cleveland had never existed, I’d have picked this up from somewhere else

5. Fisher: He’s gotta be there, since he’s had such a big influence on the statistical practice of the twentieth century

6. Jaynes: Not the philosophy of Bayes stuff, but just one bit–an important bit–in his book where he demonstrated the principle of setting up a model, taking it really seriously, looking hard to see where it doesn’t fit the data, and then looking deeply at the misfit to see what it reveals about how the model could be improved.

But I’m probably missing some big influences that I’m forgetting right now.

20 thoughts on “Influential statisticians”

  1. I imagine that a few decades or so down the road Frank Harrell will make several people's short lists, although his influence is pretty much inferential and notably frequentist.

    I mention him because (1) I think it true; and (2) I find it quite interesting that because of their quite different respective perspectives and contexts, he and Tukey offer diametrically opposed philosophies (about looking at the data: Tukey yes; Harrell no), yet in my line of work (and I imagine in many others) both have **so** much good to say.

  2. No question: Don Rubin. Somebody really should put together a collected works book sooner rather than later. I keep stumbling across papers that I can't believe I didn't read earlier.

    While not practical, I'd add Savage, Jaynes, Berger, and Roberts, as one group. The philosophical perspective definitely gives me the confidence that I'm on sturdy ground when I do applied work.

  3. Stigler – for Stigler's law, which implies we will likely get this stuff largely wrong.

    For instance – "the principle of setting up a model, taking it really seriously, looking hard to see where it doesn't fit the data, and then looking deeply at the misfit to see what it reveals about how the model could be improved" was in John Dewey's Theory of Inquiry, if not many other places in the statistical and non-statistical literature. Though it likely needs constant repeating, and Jaynes may have done a notably good job making it salient.

    Though, being at David Cox's 80th Festschrift, the breadth and depth of his work, as seen from just the moderate subset of his co-authors, was almost unbelievable.

    Which, if I recall correctly, was similar to Don Rubin's comment about him in the late '80s or early '90s.

    Keith
    p.s. of course Fisher _did_ the t-test stuff for Gosset – perhaps in exchange for Gosset showing him how to brew good beer in his basement

  4. I'm glad someone added the inventor of GLMs (John Nelder, though Wikipedia also lists Robert Wedderburn).

    In the realms I work in, Shannon has had a huge influence (info theory in general and noisy channel models/decoding in particular).

    But what about the inventor of hierarchical models? Is there one, or is it one of those constantly reinvented things? I think most of this stuff would've been invented by now if the individuals we're citing hadn't done it.

    What about Metropolis (and/or Hastings)? Without MCMC methods, where would we be?

    Speaking of MCMC, how about Markov himself?

    What about Kolmogorov? Didn't he basically lay the foundations for probability theory? Maybe he doesn't count as a statistician, since his contributions are way deeper when you consider Kolmogorov complexity and its relation to the foundations of information and set theory.

  5. I kind of agree and disagree here:

    " … at this point I think of Fisher's version of Anova as an excellent lead-in to hierarchical models and not such a great tool in itself. I think that psychology researchers will be better off when they forget about sums of squares, mean squares, and F tests, and instead focus on coefficients, variance components, and scale parameters."

    An awful lot of experimental psychologists (which I guess captures some of Seth's work) work with designed experiments that reduce to very simple ANOVAs or t tests (even if you set them up as multilevel models). The problem is that psychologists have a tendency to shoehorn everything into the ANOVA model when a more flexible and powerful tool is available. The book I'm writing on at the moment is (in part) an attempt to bridge between ANOVA and multilevel models for psychologists and other experimentalists in the human sciences.

    If it works I'm hoping that readers will be able to go from traditional tools taught in psychology to looking at more sophisticated ones – such as those in Gelman & Hill (2007).

    Thom

  6. Thom writes:

    “An awful lot of experimental psychologists (which I guess captures some of Seth's work) work with designed experiments that reduce to very simple ANOVAs or t tests (even if you set them up as multilevel models). The problem is that psychologists have a tendency to shoehorn everything into the ANOVA model when a more flexible and powerful tool is available. The book I'm writing on at the moment is (in part) an attempt to bridge between ANOVA and multilevel models for psychologists and other experimentalists in the human sciences.”

    I agree with the first sentence, and suspect Thom is referring to an entirely different group of “psychologists” for the second. Moreover, there is nothing “awful” about the first group: they are trying to do experimental science (and do quite well, imho), much as Fisher was indeed advocating. For that group, most of the debate here is at best interesting, where not irrelevant. EXCEPT, as Gelman notes in his ANOVA piece, for what even the experimentalists mean by “random” factors.

    And here, as much as I agree with most of what Gelman writes in his ANOVA paper, I have to disagree with his dismissal of design (i.e., the intent of the research) as irrelevant, even when consideration of such changes nothing about the df structure. In experimental designs, it is precisely these considerations that change everything (as Fisher emphasized). So, assuming I am missing something (not unlikely, as most of the Bayesian stuff struck me as boringly irrelevant for experimental designs), what exactly, as an experimental psychologist, have I missed?

  7. Interesting!
    No one seems to have listed what might be called personal influences – that is, professors they had. Of my professors, the one who really influenced me was Herman Friedman.

  8. Bob: keeping in mind Stigler's law, I would point to

    In 1839 Bienayme had remarked that the relative frequency of repeated samples of binary outcomes often shows larger variation than indicated by a single underlying proportion and proposed a full probability-based random effects model (suggested earlier by Poisson) to account for this.

    Interestingly (for other posts on this blog), Fisher revisited this in 1935 (meta-analysis of agricultural trials), insisting on a nonparametric random effects model; Cochran (Rubin's advisor) "cautiously" extended Fisher's work to what is now the Normal-Normal random effects/hierarchical model in 1937 (seemingly almost with an apology to Fisher for possibly not knowing better). Some of this and references are in J R Soc Med 2007;100:579-582 and more on http://www.jameslindlibrary.org

    Was wondering again, when reading Andrew's posted course notes (really interesting), why Fisher insisted on nonparametrics for random effects – perhaps not knowing how to check such assumptions (not sure anyone's really completely sure yet).

    Fisher came to mind because Andrew's comment "Confusion between quantities of interest and inferential summaries" really sounds a lot like Fisher.

    Keith

  9. How about adding Enrico Fermi, Stanislaw Ulam and John von Neumann to the list for inventing and developing the Monte Carlo method?

    The (apocryphal?) story goes that Ulam was recuperating in a hospital, playing Solitaire, when he hit on the idea of using the Monte Carlo method to estimate the percentage of winning hands. Ulam and von Neumann then went on to develop importance sampling, rejection sampling, and MCMC.

  10. Not sure how many inventorS of GLMs there actually were.

    Real history is tough (too tough for me) and there is that seemingly outrageous outburst by Kempthorne, in an RSS paper I believe, that one would have to consider in light of it involving work of a recently deceased graduate student of Kempthorne's …

    But then, Stigler's law again, and again

    Keith

  11. John Vokey:

    "Moreover, there is nothing 'awful' about the first group"

    That might be an idiomatic thing – from my side of the Atlantic it just means 'a great many'.

    I'm not sure I'm referring to an entirely different group of "psychologists" for the second. Lots of experimental designs in some fields would still benefit from multilevel models (e.g., in psycholinguistics – though they are catching on there). In any case there is some overlap! They are also often useful for dealing with missing data, time-varying covariates and increasing statistical power even in superficially simple experimental designs.

    I also think Bayesian and likelihood methods are useful for experimental designs (e.g., Rouder et al., 2009). However, the Bayesian methods useful there are perhaps different from the methods Andrew is advocating. I'm kind of persuaded that Bayesian null hypothesis tests are useful for experimental designs in a way that they aren't for Andrew's work (where H0 is implausible).

    I'd also add that multilevel models are inherently Bayesian in the way they approach modeling, so (as I've found) once you start to use multilevel models everything begins to look a bit more Bayesian …

  12. John – Explicitly, why are multilevel models inherently Bayesian?

    Given my working definition of classical or frequentist statistics as trying not to do too badly while avoiding an explicit use of a prior – I am looking for something directly about multilevel – perhaps about the type of probability of random effects (aleatory versus epistemological – my previous post re: Scholarpedia Bayes entry).

    There is a talk by Efron from the 2009 O-Bayes meeting arguing that Bayes pools in extra-study information – so if study means just an individual study rather than the set of relevant studies – that would be one explicit argument (not a strong one though).

    Keith

  13. Very few mention personal influences. My first professors were Tore Schweder and Nils Lid Hjort.

    Then there is the phenomenal book by Box, Hunter & Hunter "Statistics for Experimenters", which really teaches to look at the data, and to patent the outliers. Box&Cox transformations were a big influence.

    Through their book MASS, Venables & Ripley have influenced statistics hugely, and I am sure the influence of Nassim Nicholas Taleb & Mandelbrot will be felt more and more into the future!

  14. Keith:

    "John – Explicitly, why are multilevel models inherently Bayesian?
    Given my working definition of classical or frequentist statistics as trying"

    I think you were replying to my comment, not John's. I'm probably the last person to make the case, but modeling the random coefficients in a MLM as a probability distribution is regarded as a Bayesian approach (e.g., by Snijders & Bosker), and it is argued that residuals are posterior predictions of the model rather than residuals in the classical sense.
