## Estimation of entropy and mutual information

Published as *Neural Computation* 15: 1191-1254.

We present some new results on the nonparametric estimation of entropy
and mutual information. First, we use an exact local expansion of the
entropy function to prove almost sure consistency and central limit
theorems for three of the most commonly used discretized information
estimators. The setup is related to Grenander's method of sieves and
places no assumptions on the underlying probability measure generating
the data. Second, we prove a converse to these consistency theorems,
demonstrating that a misapplication of the most common estimation
techniques leads to an arbitrarily poor estimate of the true
information, even given unlimited data. This "inconsistency" theorem
leads to an analytical approximation of the bias, valid in
surprisingly small sample regimes and more accurate than the usual
formula of Miller and Madow over a large region of parameter
space. The two most practical implications of these results are
negative: (1) information estimates in a certain data regime are
likely contaminated by bias, even if "bias-corrected" estimators are
used, and (2) confidence intervals calculated by standard techniques
drastically underestimate the error of the most common estimation
methods.
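For concreteness, the most common discretized estimators discussed above can be sketched as follows. This is a minimal Python illustration (not the code distributed with the paper; the function names are ours): the "plug-in" maximum-likelihood estimator computed from bin counts, and the Miller-Madow bias-corrected version, which adds (m̂ − 1)/(2N), where m̂ is the number of occupied bins and N the sample size.

```python
import numpy as np

def plugin_entropy(counts):
    """Maximum-likelihood ("plug-in") entropy estimate, in nats.

    counts: array of bin counts from a discretized sample.
    """
    counts = np.asarray(counts, dtype=float)
    p = counts / counts.sum()
    p = p[p > 0]  # 0 * log(0) is taken to be 0
    return -np.sum(p * np.log(p))

def miller_madow_entropy(counts):
    """Plug-in estimate plus the Miller-Madow correction (m_hat - 1) / (2N)."""
    counts = np.asarray(counts, dtype=float)
    n = counts.sum()
    m_hat = np.count_nonzero(counts)  # number of occupied bins
    return plugin_entropy(counts) + (m_hat - 1) / (2 * n)
```

As the results above indicate, the Miller-Madow correction removes only the leading-order bias term, and both estimators can remain badly biased when the number of bins is comparable to the number of samples.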

Finally, we note a very useful connection between the bias of entropy
estimators and a certain polynomial approximation problem. By casting
bias calculation problems in this approximation theory framework, we
obtain the best possible generalization of known asymptotic bias
results. More interestingly, this framework leads to an estimator with
some nice properties: the estimator comes equipped with rigorous
bounds on the maximum error over all possible underlying probability
distributions, and this maximum error turns out to be surprisingly
small. We demonstrate the application of this new estimator on both
real and simulated data.
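The bias problem in the undersampled regime is easy to reproduce on simulated data. The sketch below (our illustration, not the paper's experiments) draws N samples from a uniform distribution over m bins with N = m; the plug-in estimate falls well short of the true entropy log(m), and averaging over many trials shows the gap is systematic bias, not variance.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, trials = 100, 100, 1000  # as many bins as samples: the problematic regime
true_h = np.log(m)             # entropy of the uniform distribution on m bins

def plugin_entropy(counts):
    """Maximum-likelihood ("plug-in") entropy estimate, in nats."""
    counts = np.asarray(counts, dtype=float)
    p = counts[counts > 0] / counts.sum()
    return -np.sum(p * np.log(p))

estimates = []
for _ in range(trials):
    samples = rng.integers(0, m, size=n)          # n draws from Uniform{0..m-1}
    counts = np.bincount(samples, minlength=m)    # discretized histogram
    estimates.append(plugin_entropy(counts))

mean_est = np.mean(estimates)
# mean_est sits substantially below true_h: a systematic downward bias
# that no amount of trial-averaging removes.
```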
