Asymptotic theory of information-theoretic experimental design.
Neural Computation 17: 1480-1507.
We discuss an idea for collecting data in a relatively efficient
manner. Our point of view is Bayesian and information-theoretic: on
any given trial, we want to adaptively choose the input in such a way
that the mutual information between the (unknown) state of the system
and the (stochastic) output is maximal, given any prior information
(including data collected on any previous trials). We prove a theorem
that quantifies the effectiveness of this strategy and give a few
illustrative examples comparing the performance of this adaptive
technique to that of the more usual nonadaptive experimental design.
In particular, we calculate the asymptotic efficiency of the
information-maximization strategy and demonstrate that this method
is in a well-defined sense never less efficient --- and is generically
more efficient --- than the nonadaptive strategy. For example, we are
able to explicitly calculate the asymptotic relative efficiency of the
``staircase method'' widely employed in psychophysics research, and to
demonstrate the dependence of this efficiency on the form of the
``psychometric function'' underlying the output responses.
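To make the strategy concrete, here is a minimal sketch (not from the paper) of the infomax idea for a simple Bernoulli observation model: the unknown state is a psychometric threshold theta, and on each trial we pick the stimulus x maximizing the mutual information between theta and the binary response under the current posterior. The logistic psychometric function and the grids are illustrative assumptions.

```python
import numpy as np

# Illustrative setup (assumptions, not the paper's model): a logistic
# psychometric function with unknown threshold theta, discretized grids.
rng = np.random.default_rng(0)
thetas = np.linspace(-3, 3, 61)      # candidate thresholds (unknown state)
inputs = np.linspace(-4, 4, 81)      # candidate stimuli
posterior = np.full(len(thetas), 1.0 / len(thetas))  # uniform prior

def psychometric(x, theta):
    """P(response = 1 | stimulus x, threshold theta): a logistic curve."""
    return 1.0 / (1.0 + np.exp(-(x - theta)))

def entropy_bernoulli(p):
    """Entropy (nats) of a Bernoulli(p) variable, elementwise."""
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -(p * np.log(p) + (1 - p) * np.log(1 - p))

def best_input(posterior):
    """Choose x maximizing I(theta; Y | x) = H(Y|x) - E_theta[H(Y|x,theta)]."""
    P = psychometric(inputs[:, None], thetas[None, :])  # shape (inputs, thetas)
    p_marg = P @ posterior                   # P(Y=1 | x) under the posterior
    mi = entropy_bernoulli(p_marg) - entropy_bernoulli(P) @ posterior
    return inputs[np.argmax(mi)], mi.max()

def update(posterior, x, y):
    """Bayes update after observing response y in {0, 1} at stimulus x."""
    lik = psychometric(x, thetas)
    lik = lik if y == 1 else 1.0 - lik
    post = posterior * lik
    return post / post.sum()

# Simulate a short adaptive run with a hypothetical true threshold of 0.7.
true_theta = 0.7
for _ in range(30):
    x, _ = best_input(posterior)
    y = int(rng.random() < psychometric(x, true_theta))
    posterior = update(posterior, x, y)
```

The nonadaptive alternative discussed in the abstract would instead fix the stimulus sequence in advance; here each trial's stimulus depends on all previous responses through the posterior.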
*A shorter version of this paper was published as: Paninski,
L. (2003). Information-theoretic design of experiments. Advances in
Neural Information Processing Systems 16.