This paper, "Biological versus nonbiological older brothers and men's sexual orientation," by Anthony Bogaert, appeared recently in the Proceedings of the National Academy of Sciences and was picked up by several news organizations, including Scientific American, New Scientist, Science News, and the CBC. As the Science News article put it,
The number of biological older brothers correlated with the likelihood of a man being homosexual, regardless of the amount of time spent with those siblings during childhood, Bogaert says. No other sibling characteristic, such as number of older sisters, displayed a link to male sexual orientation.
I was curious about this--why older brothers and not older sisters? The article referred back to this earlier paper by Blanchard and Bogaert from 1996, which had this graph:
and this table:
Here's the key quote from the paper:
Significant beta coefficients differ statistically from zero and, when positive, indicate a greater probability of homosexuality. Only the number of biological older brothers reared with the participant, and not any other sibling characteristic including the number of nonbiological brothers reared with the participant, was significantly related to sexual orientation.
The conclusions seem to rest entirely on a comparison of significance with nonsignificance, even though the differences between the coefficients do not themselves appear to be significant. (One can't quite be sure--it's a regression analysis and the different coefficient estimates are not independent, but based on the picture I strongly doubt the differences are significant.) In particular, the difference between the coefficients for brothers and sisters does not appear to be significant.
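To make the point concrete, here is a minimal sketch of how one might check the difference between two coefficients directly. The numbers are made up for illustration and are not taken from Bogaert's table; the function names are my own, and the covariance between the two estimates is assumed to be zero, which (as noted above) need not hold in a real regression.

```python
import math

def z_for_difference(b1, se1, b2, se2, cov=0.0):
    """z-statistic for (b1 - b2), given the two standard errors and the
    covariance between the estimates (assumed 0 unless supplied)."""
    var_diff = se1 ** 2 + se2 ** 2 - 2.0 * cov
    if var_diff <= 0:
        raise ValueError("variance of the difference must be positive")
    return (b1 - b2) / math.sqrt(var_diff)

def two_sided_p(z):
    """Two-sided p-value under a standard normal reference distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# Hypothetical coefficients and standard errors (NOT from the paper).
b_brothers, se_brothers = 0.25, 0.10
b_sisters, se_sisters = 0.10, 0.10

z_brothers = b_brothers / se_brothers
z_sisters = b_sisters / se_sisters
z_diff = z_for_difference(b_brothers, se_brothers, b_sisters, se_sisters)

print(f"brothers:   z = {z_brothers:.2f}, p = {two_sided_p(z_brothers):.3f}")
print(f"sisters:    z = {z_sisters:.2f}, p = {two_sided_p(z_sisters):.3f}")
print(f"difference: z = {z_diff:.2f}, p = {two_sided_p(z_diff):.3f}")
```

With these illustrative inputs, the brothers coefficient clears the 5% threshold and the sisters coefficient does not, yet the difference between them is well short of significance. In a real regression one would take the covariance from the fitted model's coefficient covariance matrix and pass it as `cov` instead of assuming independence.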
What can we say about this example?
As I have discussed elsewhere, the difference between "significant" and "not significant" is not itself statistically significant. But should I be such a hard-liner here? As Andrew Oswald pointed out, innovative research can have mistakes, but that doesn't mean it should be discarded. And given my Bayesian inclinations, I should be the last person to discard a finding (in this case, the difference between the average number of older brothers and the average number of older sisters) just because it's not statistically significant.
But . . . but . . . yes, the data are consistent with the hypothesis that only the number of older brothers matters. But the data are also consistent with the hypothesis that only the birth order (i.e., the total number of older siblings) matters. (At least, so I suspect from the graph and the table.) Given that the 95% confidence level is standard (and I'm pretty sure the paper wouldn't have been published without it), I think the rule should be applied consistently.
To put it another way, the news articles (and also bloggers; see here, here, and here) just take this finding at face value.
Let me try this one more time: Bogaert's conclusions might very well be correct. He did not make a big mistake (as was done, for example, in the article discussed here). But I think he should be a little less sure of his conclusions, since his data appear to be consistent with the simpler hypothesis that it's birth order, not the number of older brothers, that's correlated with being gay. (The paper did refer to other studies replicating the findings, but when I tracked down the references I didn't actually see any more data on the brothers vs. sisters issue.)
Warning: I don't know what I'm talking about here!
This is a tricky issue because I know next to nothing about biology, so I'm speaking purely as a statistician here. Again, I'm not trying to slam Bogaert's study; I'm just critical of the unquestioning acceptance of the results, which I think derives from an error in comparing statistical significance.