Abstract: Fast low-SNR Kalman filtering, with applications to high-dimensional smoothing

Fast Kalman filtering and forward-backward smoothing via a low-rank perturbative approach

Eftychios Pnevmatikakis, Kamiar Rahnama Rad, Jonathan Huggins, and Liam Paninski

Under review

Kalman filtering-smoothing is a fundamental tool in statistical time series analysis: it implements the optimal Bayesian filter in the linear-Gaussian setting, and serves as a key step in the inference algorithms for a wide variety of nonlinear and non-Gaussian models. However, standard implementations of the Kalman filter require $O(d^3)$ time and $O(d^2)$ space per timestep, where $d$ is the dimension of the state variable, and are therefore impractical in high-dimensional problems. In this paper we note that if a relatively small number of observations are available per time step, the Kalman equations may be approximated in terms of a low-rank perturbation of the prior state covariance matrix in the absence of any observations. In many cases this approximation may be computed and updated very efficiently (often in just $O(k^2d)$ or $O(k^2d+kd\log d)$ time and space per timestep, where $k$ is the rank of the perturbation and in general $k\ll d$), using fast methods from numerical linear algebra. We justify our approach and give bounds on the rank of the perturbation as a function of the desired accuracy. For the case of smoothing we also quantify the error of our algorithm due to the low rank approximation and show that it can be made arbitrarily low at the expense of a moderate computational cost. We describe applications involving smoothing of spatiotemporal neuroscience data.

Preprint here; see also:
- Paninski, L. (2010), Fast Kalman filtering on quasilinear dendritic trees, Journal of Computational Neuroscience 28: 211-28;
- Pnevmatikakis, E. and Paninski, L. (2012), Fast interior-point inference in high-dimensional sparse, penalized state-space models, AISTATS '12.

Here are a couple examples.

In this first example above, the black dot represents the two-dimensional position x(t) of a mouse wandering around its cage. The position of the mouse was recorded simultaneously with the activity of neurons in the hippocampal formation by Pablo Jercog, who graciously shared this data with us. See Kentros et al 2004 or Muzzio et al 2009 for further details on experimental methods. We applied our filtering methods to estimate the place field f[x,t] of a single neuron as a function of two-dimensional position x and time t: as in the simulated one-dimensional example discussed in the paper, f[x,t] represents the expected firing rate as a function of position x and time t, and is therefore explicitly allowed to change as a function of time. The neuron's observed output at time t, y(t), is represented as y(t)=f[x(t),t]+e(t), where e(t) is a random noise source. The top panels display the fast Kalman filter-smoother output at time t given the observations y(1),...,y(T) at the positions x(1),...,x(T), where T is the length of the experiment (about 500 seconds here), while the bottom panels display the posterior marginal variance (the posterior uncertainty about f[x,t]).
In more detail, we downsampled the position and neural firing rate data to .66 Hz. The spike count in each timebin was transformed to obtain the observation y(t); we applied a standard Anscombe transformation to stabilize the variance of y(t). Under the assumption that the spike counts have approximately Poisson noise, this transformation implies that the variance of the output noise e(t) is of order one. We also fixed the dynamics noise covariance so that the prior equilibrium marginal variance of f was on the order 10, in the units shown in the colorbars above; this is a fairly uninformative prior. This left us with just one free hyperparameter tau (obtained from the dynamics matrix, which was proportional to the identity, as in the examples discussed in the paper). We tried tau=T/3 (left panels) and tau=T/10 (right panels), to get a sense of of the dependence of the estimate on tau. In all cases, we used a Gaussian model for e(t), for simplicity. The place field was modeled as a weighted sum of 400 equally-spaced, isotropic Gaussian bumps, with a total resolution of 50x50=2500 pixels. Full forward-backward smoothing required a few minutes on a laptop; the effective rank of the posterior covariance minus the equilibrium covariance was on the order of 20-40 throughout the experiment.

In this second example, the left panel represents the receptive field f[x,t] of a single simulated visual neuron as a function of the two-dimensional visual spatial variable x and time t. The neuron's observed output at time t, y(t), is represented as y(t)=s(t)'f[.,t]+e(t), where e(t) is a random noise source, s(t) is the visual stimulus at time t (not shown here; as in the one-dimensional example discussed in the paper, we used spatiotemporal white noise as the stimulus), and s(t)'f[.,t] denotes the dot product of the stimulus with the receptive field at time t.
The top left panel shows the true receptive field f[.,t]; in this case, the receptive field was chosen to be a difference of two spatial isotropic Gaussian functions, and the position of these Gaussians was varied sinusoidally in time. The middle panels display the fast Kalman filter output at time t given the observations y(1),...,y(t) resulting from the stimuli s(1),...,s(t), and the right panels display the output of the Kalman smoother given the observed data over times 1,...T, with T=400 timesteps in this case. The lower left panel plots a "z-score" statistic (the posterior mean in the upper right panel divided by the square root of the posterior marginal variance in the lower right panel), to give a sense of the signal-to-noise of the output estimate.
The dimensionality of f[.,t] here was 64x64 pixels; tau = 100 timesteps; the power spectrum encoded in the prior covariance C0 fell off with 1/w^2 (where w indexes the frequency of the Fourier basis), as in the one-dimensional example discussed in the main text.

Liam Paninski's home