June 14, 2005
The difference between "statistically significant" and "not statistically significant" is not in itself necessarily statistically significant
The difference between "statistically significant" and "not statistically significant" is not in itself necessarily statistically significant.
By this, I mean more than the obvious point about arbitrary divisions, that there is essentially no difference between something significant at the 0.049 level or the 0.051 level. I have a bigger point to make.
It is common in applied research--in the last couple of weeks, I have seen this mistake made in a talk by a leading political scientist and a paper by a psychologist--to compare two effects, from two different analyses, one of which is statistically significant and one of which is not, and then to try to interpret/explain the difference, without any recognition that the difference itself was not statistically significant.
Let me explain. Consider two experiments, one giving an estimated effect of 25 (with a standard error of 10) and the other with an estimate of 10 (with a standard error of 10). The first is highly statistically significant (with a p-value of 1.2%) and the second is clearly not statistically significant (with an estimate that is no bigger than its s.e.).
What about the difference? The difference is 15 (with a s.e. of sqrt(10^2+10^2)=14.1), which is clearly not statistically significant! (The z-score is only 1.1.)
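The arithmetic above can be checked with a few lines of Python. This is only a sketch: it assumes the two estimates are independent (so their variances add) and uses a two-sided normal approximation for the p-values. The helper names `z_and_p` and `difference` are made up for this illustration.

```python
import math

def z_and_p(estimate, se):
    """Return the z-score and the two-sided normal p-value for an estimate."""
    z = estimate / se
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

def difference(est_a, se_a, est_b, se_b):
    """Difference of two estimates and its s.e., ASSUMING independence."""
    diff = est_a - est_b
    se_diff = math.hypot(se_a, se_b)  # sqrt(se_a^2 + se_b^2)
    return diff, se_diff

# The two experiments from the example in the text.
z1, p1 = z_and_p(25, 10)
z2, p2 = z_and_p(10, 10)

# The comparison that actually matters: the difference between them.
diff, se_diff = difference(25, 10, 10, 10)
zd, pd = z_and_p(diff, se_diff)

print(f"experiment 1: z = {z1:.2f}, p = {p1:.3f}")
print(f"experiment 2: z = {z2:.2f}, p = {p2:.3f}")
print(f"difference:   {diff} (s.e. {se_diff:.1f}), z = {zd:.2f}, p = {pd:.3f}")
```

If the two estimates came from overlapping data, the s.e. of the difference would need a covariance term and the numbers would change.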
This is a surprisingly common mistake. The two effects seem sooooo different that it is hard for people to even think that their difference might be explained purely by chance.
For a horrible example of this mistake, see the paper, Blackman, C. F., Benane, S. G., Elliott, D. J., House, D. E., and Pollock, M. M. (1988). Influence of electromagnetic fields on the efflux of calcium ions from brain tissue in vitro: a three-model analysis consistent with the frequency response up to 510 Hz. Bioelectromagnetics 9, 215-227. (I encountered this example at a conference in radiation and health in 1989. I sent a letter to Blackman asking him for a copy of his data so we could improve the analysis, but he refused, saying the raw data were on logbooks and it would be too much effort to copy them. We'll be discussing the example further in our forthcoming book on applied regression and multilevel modeling.)
Posted by Andrew at June 14, 2005 12:24 AM
Comments
This is a good point, and a good kick-off to discuss power analysis. Point out that one of the researchers had a 'successful' experiment, while the other didn't, because of luck.
I worry when somebody says that their data is unavailable, because it's in logbooks. At some point, it had to have been entered into a computer. And it should have been entered in sufficient detail to allow checking for errors.
Posted by: Barry at June 14, 2005 9:16 AM.
This always drives my students nuts when we discuss post hoc tests in one-way ANOVA: you know the smallest and largest group means must be significantly different, otherwise the overall ANOVA wouldn't be. But after that, almost anything goes.
I agree with Barry, this is where you introduce the idea of power, and hope for the best.
Posted by: Mike Anderson at June 14, 2005 9:56 PM.
I feel so frustrated: I did not catch it. Would you mind helping my slow brain to cope with those two paragraphs please?
"Let me explain. Consider two experiments, one giving an estimated effect of 25 (with a standard error of 10) and the other with an estimate of 10 (with a standard error of 10). The first is highly statistically significant (with a p-value of 1.2%) and the second is clearly not statistically significant (with an estimate that is no bigger than its s.e.).
What about the difference? The difference is 15 (with a s.e. of sqrt(10^2+10^2)=14.1), which is clearly not statistically significant! (The z-score is only 1.1.)"
I know, I'm not a stats student in the beginning, so maybe I should read a textbook and then ask for help.
Posted by: François at June 15, 2005 11:41 AM.
I am being driven nuts about "significance" and "non-significance". I see in one text that a number below p < .05 is significant, and that .68 and .46 are also significant. HELP!
Confused
Posted by: Anonymous at September 20, 2005 3:00 PM.