What is the evidence on birth order and brain cancer?

Bruce McCullough writes,

The probability of getting brain cancer is determined by the number of younger siblings. So claim some scientists, according to an article published in the current issue of The Economist.

I have ordered your book so that I can read more about controlling for intermediate outcomes, but I am not yet confident enough to tackle it myself. Perhaps you might blog this?

I’ll give my thoughts, but first here’s the scientific paper (by Altieri et al. in the journal Neurology), and here are the key parts of the news article that Bruce forwarded:

Younger siblings increase the chance of brain cancer

IT IS well known that many sorts of cancer run in families; in other words you get them (or, at least, a genetic predisposition towards them) from your parents. . . . Dr Altieri was looking for evidence to support the idea that at least some brain cancers are triggered by viruses and that children in large families are therefore at greater risk, because they are more likely to be exposed to childhood viral infections. . . .

Dr Altieri describes what he discovered when he analysed the records of the Swedish Family Cancer Database. This includes everyone born in Sweden since 1931, together with their parents even if born before that date.

More than 13,600 Swedes have developed brain tumours in the intervening decades. In small families there was no relationship between an individual’s risk of brain cancer and the number of siblings he had. However, children in families with five or more offspring had twice the average chance of developing brain cancer over the course of their lives compared with those who had no brothers and sisters at all.

Digging deeper, Dr Altieri found a more startling result. When he looked at those people who had had their cancer as children or young teenagers he found the rate was even higher–and that it was particularly high for those with many younger siblings. Under-15s with three or more younger siblings were 3.7 times more likely than only children to develop a common type of brain cancer called a meningioma, and at significantly higher risk of every other form of the disease that the researchers considered. . . . the mechanisms by which younger siblings have more influence than elder ones are speculative. . . . An alternative theory is that a first child may experience a period when his immune system is particularly sensitive to certain infections at about the age when third and fourth children are typically born. . . .

OK, now my thoughts. There are two issues to address here: first, what exactly did Altieri et al. find in their data analysis, and, second, how can we think about causal inference for birth order and the number of siblings?

What did Altieri et al. find?

The main results in the paper appear to be in Table 2, where the brain cancer risk is slightly higher among people with more siblings. The overall risk ratios, normalized at 1 for only children, are 1.03, 1.06, 1.10, and 1.06 for people with 1, 2, 3, or 4+ siblings, respectively. The table gives a p-value of 0.005 for the trend, but I think they made a mistake, because, in R:

> x <- 0:4
> y <- c(1, 1.03, 1.06, 1.10, 1.06)
> summary(lm(y ~ x))

Call:
lm(formula = y ~ x)

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.012000   0.019950  50.727 1.69e-05 ***
x           0.019000   0.008145   2.333    0.102

Residual standard error: 0.02576 on 3 degrees of freedom
Multiple R-Squared: 0.6446, Adjusted R-squared: 0.5262
F-statistic: 5.442 on 1 and 3 DF, p-value: 0.1019

The p-value seems to be 0.10, not 0.005.
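
To make the arithmetic behind that regression explicit, here is a minimal sketch that redoes the least-squares trend by hand. It is in plain Python rather than R, purely for illustration. The helper name `trend_test` is hypothetical, and the p-value uses the closed-form CDF of the t distribution with 3 degrees of freedom, so it applies only to this five-point case.

```python
import math

def trend_test(x, y):
    """Ordinary least-squares slope of y on x, with its t statistic.

    Hypothetical helper for illustration. The two-sided p-value uses the
    closed-form t CDF for exactly 3 degrees of freedom, so len(x) must be 5.
    """
    n = len(x)
    if n != 5:
        raise ValueError("closed-form p-value below assumes n - 2 = 3 df")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual sum of squares and the standard error of the slope.
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    resid_se = math.sqrt(rss / (n - 2))
    slope_se = resid_se / math.sqrt(sxx)
    t = slope / slope_se
    # Student t CDF with 3 df: 1/2 + (u / (1 + u^2) + atan(u)) / pi, u = t / sqrt(3)
    u = abs(t) / math.sqrt(3)
    cdf = 0.5 + (u / (1 + u * u) + math.atan(u)) / math.pi
    p_two_sided = 2 * (1 - cdf)
    return slope, slope_se, t, p_two_sided

siblings = [0, 1, 2, 3, 4]
risk_ratio = [1.00, 1.03, 1.06, 1.10, 1.06]  # only children normalized to 1
slope, slope_se, t, p = trend_test(siblings, risk_ratio)
print(f"slope={slope:.3f} se={slope_se:.6f} t={t:.3f} p={p:.3f}")
```

This should agree with the R output above (slope 0.019, t about 2.33, p about 0.10). Note that it treats the five published ratios as five equally weighted data points, so the p-value reflects only how closely they follow a line, not how many cancers lie behind each ratio.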

Stronger results appear in Tables T-1, T-2, and T-3 (referred to in the paper and included in the supplementary material at the article’s webpage). Risk ratios for brain cancer are quite a bit higher for kids with 3 or more younger siblings, and lower for kids with 3 or more older siblings.

In all these tables, results are broken down by type of cancer, but sample sizes are small enough that I don’t really put much trust in these subset analyses. A multilevel model would help, I suppose.

Causal inference for birth order

How to think about this causally? We can think about the number of younger siblings as a causal treatment: if little Billy’s parents have more kids, how does this affect the probability that Billy gets brain cancer? But how do we think about older siblings? I’m a little stuck here: if I try to compare Billy as an only child to Billy as the youngest of three children, it’s hard to think of a corresponding causal “treatment.”

Thinking forward from treatments, suppose a couple has a kid and is considering having another. One could imagine the effect of this on the first child’s probability of brain cancer. One can also consider the probability that the second child has brain cancer, but the comparison of the two kids would not be “causal” (in the Rubin sense). This is not to dismiss the comparison–I’m just laying out my struggle with thinking about these things. Similar issues arise in other noncausal comparisons (for example, comparing boys to girls).

The reluctant debunker

Finally, I’m sensitive to Andrew Oswald’s comment that I’m too critical of innovative research–even if there are methodological flaws in a paper, its conclusions could still be correct. My only defense is that I’m responding to Bruce’s request: I wasn’t going out looking for papers to debunk.

In any case, I’m not “debunking” this paper at all. I don’t see major statistical problems with their comparisons; I’m just struggling to understand it all.

4 thoughts on “What is the evidence on birth order and brain cancer?”

  1. Andrew,

    Thanks for your thoughts.

    I think that you are being too sensitive to Oswald's remark; you have mentioned it more than once. He misrepresents your position (as I see it). He writes: "Your younger readers are constantly getting the subtle message: A POTENTIAL METHODOLOGICAL FLAW IN A PAPER MEANS ITS CONCLUSIONS ARE WRONG."

    Nothing could be farther from the truth. What you are suggesting is that a methodological flaw in a paper implies that the paper is not necessarily correct. (Rarely does any one paper make or break a particular hypothesis; it is the "preponderance of evidence" accumulated by many investigators that confirms or disproves a particular hypothesis.) Nonetheless, a paper that has no methodological flaws should constitute stronger evidence than a paper that has methodological flaws. In this regard, you are doing us all a service, rather than a disservice as Oswald suggests, when you point out methodological flaws. I urge you to continue doing so. I find these examples most instructive. They show that the refereeing process isn't perfect (though many would pretend it is), and that publication in a journal is not the final word on any subject (though many in the media are under the illusion that it is).

    Regards,

    Bruce

  2. I think there is an identifiability problem in birth order analyses. I can think of three different effects you might be interested in:

    Family size (total number of siblings)
    Number of older siblings
    Number of younger siblings

    But these are all linearly related:

    older + younger = total siblings

    so any (log) linear trend in risk with one of these factors is aliased with the other two. Exactly the same problem is found in age-period-cohort analyses in population-based survival analysis, where the effects of birth cohort are aliased with calendar time and age.

    It's not easy to find an appropriate model that captures the effect of "birth order". I would favour one that includes a term for family size (as a factor) and also a trend with number of younger siblings. The trend coefficient gives the effect of swapping birth order with the next youngest sibling.

    Alternatively you could have family size + number of older siblings in the model, which would give you the reciprocal of the risk ratio for the trend, so the model has a nice symmetry. (But the risk ratios for family size would be completely different, underscoring the identifiability problem).

    On another topic, I don't think your attempt to reconstruct the p-value for the trend is very fair, as it ignores information about the standard errors of the odds ratio estimates. This is an example of a general problem: the way relative risks are normally reported in epidemiological papers, the reader never has enough information to calculate the standard error of any risk contrast, except the one chosen by the authors. Easton, Peto, and Babiker (Stat Med 1991;10:1025–35) invented "floating absolute risks" to overcome this problem.

  3. Martyn,

    Yes, with birth order there are more potential comparisons than there are degrees of freedom in the data. I'm not sure how to think about this. Having more younger siblings is correlated with brain cancer, though, and that seems interesting.

    Regarding the trend analysis: I don't think that adjusting for the se's of the odds ratio estimates will do much, since (potentially) there is error in the model beyond the binomial variation.

  4. This is speculative and perhaps a little unfair, but a possible mistake might explain the strong results for young children. Perhaps the families are large because, after a child develops cancer, the family chooses to have more children.
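
Returning to the identifiability point in comment 2: because older + younger = total siblings, a (log) linear predictor in all three counts is not pinned down by the data. A minimal Python sketch of that aliasing follows. The coefficient values and the helper name `linear_predictor` are made up for illustration and are not estimates from the paper.

```python
from itertools import product

def linear_predictor(coefs, older, younger):
    """Log-risk score b_total*total + b_older*older + b_younger*younger."""
    b_total, b_older, b_younger = coefs
    total = older + younger  # the constraint that causes the aliasing
    return b_total * total + b_older * older + b_younger * younger

# Two hypothetical coefficient sets (b_total, b_older, b_younger).
# The second adds delta to b_total and subtracts it from the other two.
delta = 0.7
coefs_a = (0.10, -0.05, 0.20)
coefs_b = (0.10 + delta, -0.05 - delta, 0.20 - delta)

# Every configuration with up to 4 older and up to 4 younger siblings.
configs = list(product(range(5), repeat=2))
max_gap = max(
    abs(linear_predictor(coefs_a, o, y) - linear_predictor(coefs_b, o, y))
    for o, y in configs
)
print(f"largest difference in fitted log risk: {max_gap:.2e}")
```

The two coefficient sets give the same fitted value for every family (up to floating-point rounding), so no data set can tell them apart. In a model with b_older = 0, choosing delta = b_younger moves the younger-sibling trend onto older siblings with the sign flipped, which is the symmetry described in comment 2.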
