Krzysztof Burdzy replies to the reviews of Christian Robert and myself

Last week, Christian Robert and I separately reviewed Krzysztof Burdzy’s book, The Search for Certainty, which I characterized as a harmless if misleading discussion of the philosophy of probability. Burdzy sent us his reply, which I will post below, followed by my comments. I am omitting some parts of Burdzy’s comments that are specific to Christian’s review and not of general interest.

Dear Professors Gelman and Robert,

First of all, thank you for reading my book. I will refer to [your] reviews as [CR] and [AG]. I will refer to my book as [KB]. Since I referred to the foundations of probability as “one of the greatest intellectual failures of the twentieth century”, I should not have expected an enthusiastic reaction from the community of probabilists, statisticians and philosophers. Hence, your criticism is not a great surprise for me. In this reply, I will try to avoid opinions, as much as it is possible in this area–we obviously have different opinions and we all expressed them in public. I will try to focus on facts, because this may help the readers of your reviews and the readers of my book.

FACTS

1. I witnessed the following event. Wilfrid Kendall was asked a question at the end of a talk at a conference. He said “My answer will be very aggressive. I totally agree with you.” (He followed with more substantial remarks). My reaction to your reviews will be very aggressive: I totally agree with you. Well, to be honest, I totally agree on one point: “[the book does not make] a significant contribution to the foundations of statistical inference in general and of Bayesian analysis in particular.” ([CR] p. 7). On page vii of [KB], I clearly state my three intellectual goals: (i) Criticism of von Mises and de Finetti, (ii) Presentation of my “scientific theory of probability”, and (iii) Education of scientists about philosophical theories of probability. The subtitle of [KB] (at least implicitly) indicates that the book is about the miscommunication between philosophers and scientists. On p. 199 of [KB] I say that it is not my ambition to reform statistics. I would have never stated my intellectual goals as a desire to make “a significant contribution to the foundations of statistical inference in general and of Bayesian analysis in particular.”

. . .

5. On p. 6 of [CR], the following quote from [KB] is given: “subjective theory does not provide any justification for the use of the Bayes theorem”. A very similar quote appears in the second to last paragraph of [AG]. I find it amusing that [CR] calls my claim “rather nonsensical” while [AG] says “fine by me, but of course nothing new.” I can’t win–if my claim is true, I failed to notice that it was well known; if it is false, I failed to notice that it was nonsensical.

. . .

OPINION

8. On p. 3 of [CR] there is a remark on the “lack of involved examples”. I do not see how involved examples could have changed the perception of my book. I do not have any new frequency or Bayesian statistical methods to propose. I believe that all statisticians, probabilists and other scientists should use (L1)-(L5) to teach the basic principles of probability. Would involved examples make (L1)-(L5) more attractive to you?

9. On p. 6 of [CR] I am criticized for concentrating “on two very specific entries to frequentism and subjectivism, namely von Mises’ and de Finetti’s, respectively, while those are not your average statistician’s references.” I explain on p. 12 of [KB] why I chose the theories of von Mises and de Finetti. Let me repeat and rephrase my reasons. These two theories are more or less complete and more or less logically consistent philosophical theories created by people who are recognized by philosophers as the leading figures in frequency and subjective currents of philosophy of probability. De Finetti and von Mises wrote books that I could study and criticize. There is an implicit suggestion in [CR] that I have chosen the wrong theories to criticize and that statisticians apparently use other philosophical theories. As far as I can tell, statisticians have a multitude of philosophical opinions but that does not mean that these opinions add up to a logically consistent theory. If I ever want to criticize the Catholic theology, I will use the official Vatican doctrine as the target of my criticism. It is unquestionably true that the real religious beliefs of Catholics are quite often different from the Vatican doctrine, but the union of all beliefs of all Catholics does not add up to a logically consistent philosophy (as far as I can tell).

. . .

My comments: I stand by my opinion that, from this statistician’s perspective, Burdzy’s book is unremarkable except in its insistence on its remarkability. I also continue to disagree with his statement that “standard textbooks on chemistry do not discuss subjectivity in their introductions, and so statistical textbooks need not to do that either”; yes, chemistry and statistics overlap–I’ve done some work on toxicology, myself–but overall I think the two fields can think about their textbooks independently. Regarding Burdzy’s point 9 above, he can feel free to criticize von Mises, de Finetti, or even Pope Benedict. None of this has any impact on my work but it could be of interest to others.
Finally, regarding point 5 above, let me emphasize that Christian Robert and I are two different people. In any case, I hope this discussion is helpful to somebody somewhere!

14 thoughts on “Krzysztof Burdzy replies to the reviews of Christian Robert and myself”

  1. I've noticed this phenomenon a lot lately: philosophers critiquing ideas used in the sciences, usually woefully missing the mark and then making sensational pronouncements about their findings. This only has the effect of causing a lot of scientists to shake their heads and carry on with what they were doing. I can't comment on Krzysztof Burdzy's book and so I'm perhaps missing the mark, but there is no shortage of examples of this, Jerry Fodor's critique of natural selection being one of the latest.

  2. I think Fodor and Burdzy are quite different cases. Fodor, writing in the London Review of Books, has a larger audience than most biologists do, and writes in a fluid English style, thus making his work a relatively accessible introduction–for better or for worse–to biological ideas. Burdzy, however, has a much smaller audience than the statisticians he criticizes and just happened to get lucky enough to get a copy of his book in Christian Robert's hands. On the intellectual level, the two critiques may be similar–I don't know enough about biology to comment–but the social contexts are different.

    A better analogy to Fodor might be Taleb.

  3. Mark Mardell wrote "One of the worrying things about American political discourse is that some (mainly white) people regard any reference to race as "racist" without looking at the content of what is being said." (see this BBC blog). I believe that one of the worrying things about statistical discourse is that some people regard any reference to Bayesian statistics as an attack against Bayesian statistics without looking at the content of what is being said.

    It is hard to summarize my book in 10 words or less but let me try:
    (i) Philosophical theories of probability created in the twentieth century were total nonsense.
    (ii) Statistics (frequency and Bayesian) is a great field of science.
    (iii) One should purge absurd philosophical ideas from statistics, especially teaching of statistics.

    Christian Robert posted a review of my book on the Arxiv. The review is venomous. Robert does not find any value in the substance or presentation of my book. The review contains only a handful of remotely positive remarks. I find this position intellectually inconsistent. If the book is really so bad, why bother to write a review? I offer the following psychological explanation. Robert noticed correctly that my book is a vicious attack and that it refers to Bayesian statistics. Bayesian feelings are apparently so sore that he concluded that my book is a vicious attack against Bayesian statistics. This explains his vicious counterattack.

    What I propose in my book is that all statisticians should use my set of scientific laws of probability (L1)-(L5) as a didactic tool. This is similar to, say, proposing that the periodic table of elements should be presented as a 3-dimensional structure, not as a 2-dimensional structure. A chemist who rejected such an idea would probably call it "useless". Gelman has a negative opinion of my book and calls it "harmless". "Harmless" indicates to me that Gelman, despite his rational and reasonable attitude in his publications, got some of the siege mentality from his Bayesian colleagues.

    In my book, I characterize de Finetti's philosophical theory as nonsensical. The philosophical theory of von Mises is criticized for being useless. I have some negative things to say about some frequency statistical methods. I say almost nothing negative about Bayesian statistics. It is remarkable that the only full scale review of my book (so far) comes from Robert, a Bayesian statistician, and that it has the character of a nuclear counterattack.

    Chris Burdzy

    P.S. In an earlier comment, Steve writes "Philosophers critiquing ideas used in the sciences…" I am a scientist critiquing philosophy, not a philosopher critiquing science. I suggest that Steve has at least a brief look at my book before offering an opinion.

  4. Chris: As an author myself, I can understand your sensitivity to negative reviews. I feel the same way. Really, though, I think you should feel lucky that Christian happened to end up with a copy of your book so that he and I could review it and give it some publicity that otherwise I doubt it ever would've had.

  5. Chris
    > Philosophical theories of probability created in the twentieth century were total nonsense.

    I believe it's still beyond anyone's grasp, but simply "discrediting" two of the possibly "not least wrong" contributors may not be the least distracting way forward.

    My own personal take is that CS Peirce was on a much less wrong track but didn't get support/funding to write it up coherently, Frank Ramsey picked up the torch but passed away soon after (at age 26), and one of the few "notable" philosophers who since had a kick at the can – Ian Hacking – lost interest.

    > I am a scientist critiquing philosophy

    Then I would suggest trying to establish some productive discourse with them – philosophers (as well as statisticians). I don't believe there are many that are both.

    Keith

  6. Andrew: I did notice the publicity and I do appreciate it – thank you!

    I hate to repeat myself but let me explain my bitterness once again. I called the philosophical theories of de Finetti and von Mises "laughable fantasies" in my book so I braced myself mentally for a review coming from a professional philosopher, starting with "Chris Burdzy, a wannabe philosopher who can't tell the difference between ontology and epistemology, wrote a simplistic book …" I would not be happy but at least I would know why I was hit with a nuclear bomb.

    What did I do to Bayesian statisticians? Why does Robert want to burn me at the stake?

  7. Keith,

    I have a feeling that you did not read my book. Please note that the preface, contents and introduction are available for free at the publisher's Web site and my Web site. The introduction itself can help you understand my position.

    When you said "not least wrong", I think that you meant "least wrong" – a typo. I do not consider the theories of de Finetti and von Mises "least wrong". They are definitely most popular. Let me use this analogy. Fascism and communism were the most dynamic political ideologies in Europe, from Madrid to Moscow, in the middle of the twentieth century. I would call Popper's definition of democracy "minimalist". It seems to me that he was saying that it is best to try to achieve a reasonable minimum rather than to be lured by unsubstantiated promises of radical political philosophies. In a sense, this is what I am trying to do in my book in the area of probability. I believe that Popper's "falsifiability" is a reasonable, if somewhat boring idea, that should replace radical philosophical theories of de Finetti and von Mises.

    I have a guess as to why the frequency and subjective theories are the only reasonably popular philosophies of probability. They are the only philosophies that can be taught in a simplified (vulgarized?) way at the undergraduate level. I have no clue how to teach the "logical" philosophy of probability at the undergraduate level, or what meaningful remarks I could make about the "propensity" theory to undergraduates. Also, note that the "classical" theory is doing quite well although I would not call it a philosophical theory at all. I believe that my laws (L1)-(L5) could succeed because they can be easily taught at the undergraduate level. This is what I mean by "repackaging of Popper's idea".

    I do not know enough about Peirce to comment. Ramsey believed in both objective and subjective probabilities. I mention this in Sect. 7.14 of my book and reject this philosophical position (in the same section). I read Hacking. I do not accept his philosophy. More importantly, I do not see any support for his philosophy among statisticians, and I do not see how it could be taught at the undergraduate level.

    Chris

  8. Without launching into a full debate, my points are that
    * I do not see the relevance of the book for my statistical practice, my statistical theory or my statistical thinking. It does not help me in any way to build better representations of the way statistics should proceed [and therefore better procedures]. Indeed, I do not have anything positive to contribute.
    * I would like very much to see the analysis of the book made by a philosopher of sciences or an epistemologist. But it may be that this approach offers no particular appeal to such philosophers. The level of the philosophical debate in the book does not strike me as very deep…
    * I do not act as a Bayesian under siege for the very reason above. Furthermore, I think the criticisms against frequentist procedures are similarly useless. Unbiasedness may have a heap of defects but they are not addressed by the book. Similarly, the drawbacks of following a Bayesian approach are open to discussion, but are not included in the book, which ends up being very tolerant towards the imprecision in the definition of the priors or the testing procedures.
    * I wrote a review after spending some time going through the book with a pencil, in order to make use of the time thus spent and include other Bayesians (and non-Bayesians) in the discussion. The book includes nominal criticism of Berger's and Gelman's books, thus the author should have been ready for some reply of sorts. (Actually, the authors should have been contacted before or after the book was published).
    * I do not want to burn anyone at the stake, please! Philosophy is primarily about debate and dialogue, so I see no ill in stating my disagreements. The book is published, is available to anyone who wants to read it, and I obviously do not aim at censoring or putting it to the stake!

  9. Chris: First some advice someone once gave my group – no matter how negative a review someone gives you about your work, pretend, no matter how hard or ridiculous that might be, that they were trying to help you.

    > not read my book.
    Right but only after looking at your web stuff and slides from two talks you gave and not being convinced it would be worth my time. What would provide some encouragement would be what xi'an suggested “see the analysis of the book made by a philosopher of sciences”. I am not actively working in that area and the only name that comes to mind is Deborah Mayo who is a philosopher who did some work with David Cox.

    > said "not least wrong"
    My polite way of saying von Mises and de Finetti were not at the top of my list.

    > at the undergraduate level.
    Most in the statistical discipline will have never taken a course in philosophy or done any non-trivial reading of philosophy (80-90% ? ) and so they will be at the undergraduate level. (Also this makes what’s popular to them less important in many ways)

    > reject this [Ramsey’s] philosophical position
    Some modesty may better suit you (and encourage more rather than fewer people to read your book).

    > how it [Hacking] could be taught at the undergraduate level.
    I am guessing you are referring to Hacking’s introductory textbook, which he wrote around 1997-2000. I did sit in on his undergraduate course in 1996/7 and read a draft of the book. The course was well attended, the students were quite happy and they probably learned some worthwhile things – maybe or maybe not enough to critically read your book. I looked into giving the same type of course at another university jointly with a faculty member in the Philosophy department but it fell through.

    So I do believe courses and books on Philosophy and Statistics _can_ do more good than harm.

    Keith

  10. Keith,

    First of all, thank you for your gentle way of steering me towards calm waters – I appreciate it.

    I did not understand your comment about me rejecting Ramsey and lack of modesty. I might be missing something because English is not my native language. I would say that "Millions of Muslims reject Hindu philosophy." Why would this make them immodest?

    Your two remarks about undergraduate teaching indicate a significant misunderstanding between us, and it is clearly my fault. I am not a philosopher and I never teach philosophy at any level, graduate or undergraduate. I am a probabilist and I often teach probability, at graduate and undergraduate levels. Undergraduate textbooks on probability devote a few paragraphs or a few pages to the foundational issues. I devote about 15 minutes of class time to foundational issues when I teach an undergraduate probability class. This is because I think that my primary responsibility is to teach these students how to calculate the variance of the binomial distribution etc.

    I believe that it is possible to present a simplified ("vulgarized"?) version of the frequency theory to undergraduates in 15 minutes. I also believe that the same applies to the subjective theory and to my laws (L1)-(L5).

    I am afraid that whatever I could tell undergraduate students about the logical theory of probability or propensity theory of probability in 15 minutes would sound either trivial or useless or both (despite the fact that these theories are not trivial).

    Chris

  11. Chris – three things about "reject"

    1. In an academic rather than personal belief context, "reject" _can_ be taken to mean the work never had any real value.

    2. Given the expected opposition to someone like you who is not a philosopher, it could be better for your work if you softened your criticisms of philosophers.

    3. In math, proofs/derivations are either right or wrong and often you can be very sure about when and where. Taking a philosophical theory as a set of connected models, à la George Box's "All models are wrong but some are useful", getting at ways to make models less wrong and assessing their usefulness for a given purpose will always be tentative and in some sense wrong. Strong conclusions are perhaps never warranted.

    Geoff Hinton once said that those who wanted to work in Neural Nets had two choices – learn statistics or make friends with a statistician. He then admitted he did not know which would be easier.

    You may need to make friends with the right philosopher or somehow obtain credibility for knowing enough philosophy for your interests in probability.

    Keith

  12. Keith,

    Thanks for your remarks on "rejection". I still do not have an intuitive feeling for why "rejecting" a theory is an offensive remark, but I do understand your explanations.

    Concerning the last part of your comment, I see that you misunderstood my ambitions, just like Robert did. Robert thought that I tried to revolutionize Bayesian statistics, so he counterattacked. You seem to believe that I am trying to become a recognized philosopher. It is too late for me to become a philosopher, even if I ever had a talent to be one. I have been a mathematician for most of my life and I will always be.

    My ambition is to sell (L1)-(L5) as a standard way to teach probability to undergraduate students. My chance of success is probably low, but if I ever succeed, my success will have nothing to do with "making friends with the right philosopher". If we consider de Finetti a philosopher rather than scientist, then he is probably the only philosopher ever who had an impact on teaching probability at the elementary level.

    My book does contain a lot of philosophy. The reason is that if I published a 3-page note suggesting replacing the standard frequency and subjective interpretations of probability with (L1)-(L5), nobody would notice it.

    Chris

  13. I read Burdzy's unpublished "probability is symmetry" book some time ago…

    I was never really sure what to make of it. He cites lots of the big names in probability theory but his work does not echo that of the most prominent critics of Bayesian statistics, e.g. David Freedman.

    His thought experiment I do remember was interesting, but in my view did not represent anything like a decisive blow to Bayesian theory. He describes an interesting scenario in http://www.math.washington.edu/~burdzy/Philosophy

    From a Bayesian point of view this is the problem (rephrased, much simplified and interpreted by myself… I have omitted some of my reasoning for brevity):

    Urn 1 contains 1000 balls; its contents are unknown (but all balls are black or white). However, when drawing with replacement from this urn it is found that the fraction of times a white ball is drawn is 48% or 47.9%. It isn't spelt out how this could be confused, and it also isn't clear how many times the urn was drawn from, but reading between the lines it might be 1/2*365*40=7300. Maybe a reasonable reading is to say that this means that there have been 3500 white balls drawn and 3800 black, which makes the fraction of white balls drawn 47.945%, which is as close to 47.95% as we can get.

    This can be used to compute a conditional probability for the contents of urn 1 (perhaps using a Polya model) which will be peaked around 479 – 480. The conditional probability for drawing from this urn without replacement is a mixture of hypergeometric distributions, marginalizing over the unknown contents of the urn. Note this is precisely where the either 48% or 47.9% becomes critical… the first suggests that the historical record supports the contents of the urn being 480 white, the second 479 white…

    Urn 2 contains 490 white balls and 510 black. It is intersubjective to use a hypergeometric distribution with 490 and 510 as the parameters for draws without replacement from this. For a single draw the probability of white is of course just 0.49

    Then 999 balls are drawn from urn 1, and 479 of these are white. The conditional probability of the next ball being white is sensitive to prior belief but will be near 0.49.

    The probability for urn 2 of being white is 0.49.

    Burdzy asks a question which amounts to which urn is more likely to yield a white ball. He has deliberately set up a close call and the answer is sensitive to prior belief.

    He suggests that a Bayesian would fail to condition their belief about the contents of urn 1 on the historical record and would adopt a probability of 0.5 for it, whereas in reality a Bayesian would update their probability for the contents of the urn to be around 479-480 white balls.

    The question that he asks is a close call, but if instead when the 999 balls are drawn a different number of white balls from 479 is observed then the call will no longer be close. Equally if the historical record was clear cut on the fraction of 479 or 480 being the historical fraction (and how can it not be) then it will also not be a close call.

    The whole paradox is reliant on a couple of things:

    The 'either' 48% or 47.9% thing… which just doesn't have practical meaning. You could say the historical record was 47.95% but this just makes the answer very prior sensitive….

    The idea that a Bayesian would not update their conditional probabilities about urn 1 (so they would be 0.5).

    He then introduces a game theoretic situation, but the assumed behaviour of the Bayesian is incorrect….
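
The urn comparison described in comment 13 can be sketched numerically. This is only an illustration under stated assumptions, not Burdzy's or the commenter's own calculation: it takes the commenter's reading of the historical record (3500 white and 3800 black draws with replacement), assumes a uniform prior over the number of white balls in urn 1, and uses hypothetical function names.

```python
import math

N = 1000                               # balls in urn 1
WHITE_SEEN, BLACK_SEEN = 3500, 3800    # historical draws with replacement (commenter's reading)
DRAWN, DRAWN_WHITE = 999, 479          # later draws without replacement


def log_binomial_likelihood(w):
    """Log-likelihood of the historical record if urn 1 holds w white balls."""
    p = w / N
    if p in (0.0, 1.0):
        return -math.inf               # both colours were seen, so these are impossible
    return WHITE_SEEN * math.log(p) + BLACK_SEEN * math.log(1 - p)


def hypergeom_likelihood(w):
    """Probability of 479 white in 999 draws without replacement, given w white balls."""
    black = N - w
    drawn_black = DRAWN - DRAWN_WHITE
    if DRAWN_WHITE > w or drawn_black > black:
        return 0.0
    return math.comb(w, DRAWN_WHITE) * math.comb(black, drawn_black) / math.comb(N, DRAWN)


def prob_last_ball_white():
    """Posterior probability that the one remaining ball in urn 1 is white.

    Assumes a uniform prior over w = 0..N (an assumption, not from the source).
    """
    candidates = range(N + 1)
    log_hist = {w: log_binomial_likelihood(w) for w in candidates}
    shift = max(log_hist.values())     # rescale to avoid underflow in exp()
    weights = {
        w: math.exp(log_hist[w] - shift) * hypergeom_likelihood(w)
        for w in candidates
    }
    total = sum(weights.values())
    # With w white balls in the urn and 479 already drawn, w - 479 white balls remain.
    return sum(
        wt * (w - DRAWN_WHITE) / (N - DRAWN) for w, wt in weights.items() if wt > 0
    ) / total


if __name__ == "__main__":
    print("urn 1, last ball white:", prob_last_ball_white())
    print("urn 2, single draw white:", 490 / 1000)
```

Under these assumed inputs only 479 or 480 white balls are consistent with the 999 draws, and the result for urn 1 comes out a little below urn 2's 0.49. A different prior would shift it, which is the commenter's point about prior sensitivity.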
