Using experts’ ranges

Doug McNamara writes,

I am preparing for my first year as a graduate student at the University of Maryland in their Department of Measurement, Evaluation and Statistics. I’ve been reading your blog for a few months, and thought I would finally ask a question. So, here it is:

I have some data on number of terrorist/insurgent troops in a country. For some of the cases, the data could not be directly measured; instead, experts on the country in question were surveyed. For these survey responses, the dataset provides a range of possible values for number of troops, with the range usually representing the high and low estimates (rounded to the nearest thousand). For instance, experts have assigned a range of 10,000-15,000 for number of UNITA troops in Angola in 1989.

So, the question is, how do I go about assigning an actual value to those situations where there is a range? Initially, I was thinking about simply using the mean between the high and low values, but I know nothing about the distribution of expert opinions. Alternatively, I could simply assign a random value within the range. A third option would be to run three tests—one where I only use the low values, one where I use the high values and a third where I use the median/random value approach.

I should mention I would like to assign a single value for the simple purpose of running a t-test to see if there is a difference in average number of troops when the group is foreign funded or not.

My reply: Considering this as a statistical problem, you could treat the actual number as missing data and then use a rounded-data likelihood (as in Exercise 3.5 of Bayesian Data Analysis). In your case, however, I’d probably just use the average (or the geometric mean) of the range. I wouldn’t take these ranges very seriously: in general, experts are notorious for giving estimates where the truth falls outside the range of their guesses. So I don’t see you getting anything special from looking at the high and low values as if they were actually upper and lower bounds.
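To make the two point-value choices concrete, here is a minimal sketch that collapses each range to its average or its geometric mean and then computes Welch's t statistic for the funded versus unfunded comparison. Only the 10,000-15,000 UNITA range comes from the question; every other range, the group assignments, and the function names are made up for illustration, and Welch's version of the t-test is my assumption rather than anything specified above.

```python
import math
import statistics

def midpoint(low, high):
    """Arithmetic mean of the two endpoints."""
    return (low + high) / 2

def geometric_mean(low, high):
    """Geometric mean of the endpoints; both must be positive."""
    return math.sqrt(low * high)

def welch_t(a, b):
    """Welch's t statistic for two samples with possibly unequal variances."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se

# Hypothetical data: only the first range (UNITA, Angola, 1989) is from the post.
funded = [(10_000, 15_000), (4_000, 6_000), (20_000, 30_000)]
unfunded = [(1_000, 3_000), (5_000, 8_000), (2_000, 4_000)]

# Compare the test statistic under each way of collapsing a range to one number.
for name, collapse in [("average", midpoint), ("geometric mean", geometric_mean)]:
    t = welch_t([collapse(lo, hi) for lo, hi in funded],
                [collapse(lo, hi) for lo, hi in unfunded])
    print(f"{name}: t = {t:.2f}")
```

The geometric mean always sits at or below the average, and the gap grows as the range widens, so the two choices differ most for the groups the experts are least sure about.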

2 thoughts on “Using experts’ ranges”

  1. Why not treat the range as a confidence interval from a normal distribution? Given the skepticism about expert ranges, I would use a fairly low confidence level, perhaps 50%.

    You could then combine the resulting normal distributions in the two groups, and generate probability distributions for fraction of population involved in insurgency as a Gaussian mixture model.

    The result wouldn't test whether there actually were more insurgents in foreign funded countries, but rather whether the experts thought there were…
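One way to read the commenter's suggestion, as a rough sketch: center a normal distribution on the middle of each range and choose its standard deviation so that the range covers the central 50% of the probability. The function names are hypothetical, the equal weighting of the mixture components is my assumption, and the sketch works with troop counts rather than population fractions.

```python
from statistics import NormalDist

def range_to_normal(low, high, level=0.5):
    """Normal distribution whose central `level` interval is [low, high]."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 0.674 when level is 0.5
    mu = (low + high) / 2
    sigma = (high - low) / (2 * z)
    return NormalDist(mu, sigma)

def mixture_mean_sd(components):
    """Mean and standard deviation of an equal-weight Gaussian mixture."""
    n = len(components)
    mean = sum(c.mean for c in components) / n
    second_moment = sum(c.variance + c.mean ** 2 for c in components) / n
    return mean, (second_moment - mean ** 2) ** 0.5

# The UNITA range from the question, read as a 50% interval.
unita = range_to_normal(10_000, 15_000)
print(unita.mean, unita.stdev)

# Hypothetical second range, combined with the first into one group's mixture.
group = [unita, range_to_normal(4_000, 6_000)]
print(mixture_mean_sd(group))
```

Reading the range as a 50% interval makes the implied standard deviation noticeably larger than a quarter of the range width, which is the skepticism the commenter is asking for.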
