Some important topics will be omitted because high-quality solutions are already available in most software. For example, the generation of pseudo-random numbers is a classic topic, but existing methods built in to standard software packages will suffice for our needs. On the other hand, we will spend a bit of time on some classical numerical linear algebra ideas, because choosing the right method for solving a linear equation (for example) can have a huge impact on the time it takes to solve a problem in practice, particularly if there is some special structure that we can exploit.

Deterministic optimization

- Newton-Raphson, conjugate gradients, preconditioning, quasi-Newton methods, Fisher scoring, EM and its various derivatives

- Numerical recipes for linear algebra: matrix inverse, LU, Cholesky decompositions, low-rank updates, SVD, banded matrices, Toeplitz matrices and the FFT, Kronecker products (separable matrices), sparse matrix solvers

- Convex analysis: convex functions, duality, KKT conditions, interior point methods, projected gradients, augmented Lagrangian methods, convex relaxations

- Applications: support vector machines, splines, kriging, isotonic regression, LASSO and LARS regression

Dynamic programming: hidden Markov models, forward-backward algorithm, Kalman filter, Markov random fields

Stochastic optimization: Robbins-Monro and Kiefer-Wolfowitz algorithms, simulated annealing, stochastic gradient methods

Deterministic integration: Gaussian quadrature, quasi-Monte Carlo. Application: expectation propagation

Monte Carlo methods

- Rejection sampling, importance sampling, variance reduction methods (Rao-Blackwellization, stratified sampling)

- MCMC methods: Gibbs sampling, Metropolis-Hastings, Langevin methods, Hamiltonian Monte Carlo, slice sampling. Implementation issues: burnin, monitoring convergence

- Sequential Monte Carlo (particle filtering)

Givens and Hoeting (2005) Computational statistics

Robert and Casella (2004) Monte Carlo Statistical Methods

Boyd and Vandenberghe (2004), Convex Optimization.

Press et al, Numerical Recipes

Sun and Yuan (2006), Optimization theory and methods

Fletcher (2000) Practical methods of optimization

Searle (2006) Matrix Algebra Useful for Statistics

Spall (2003), Introduction to Stochastic Search and Optimization: Estimation, Simulation, and Control

Shewchuk (1994), An Introduction to the Conjugate Gradient Method Without the Agonizing Pain

Boyd et al (2011), Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers

Date | Topic | Reading | Notes |
---|---|---|---|

Jan 17 | Introduction; linesearch | Ch. 1-2 of Givens+Hoeting | See Sun and Yuan (2006) for further details on convergence analysis. HW: derive the convergence rate of the secant method. |

Jan 22 | Choosing search directions: Newton, generalized linear models, inexact Newton, quasi-Newton, Fisher scoring, BFGS | See Vandenberghe's notes for some further background | |

Jan 27 | Snow day | ||

Jan 29 | Exploiting special structure to solve Newton linear equations more efficiently: banded, sparse, low-rank, block-structured (etc.) matrices | HW: density estimation via convex optimization and banded Hessians. | |

Feb 3 | Conjugate gradients. Preconditioning. Toeplitz and circulant matrices | Shewchuk (1994); see Chan and Ng (1996) on PCG for Toeplitz systems. | |

Feb 5 | Gaussian process regression | Notes by John Cunningham. See Rasmussen and Williams (2006) for more background on GP regression. HW: code up a GP regression. | |

Feb 10 | Expectation maximization | See Dempster et al (1977) and Neal and Hinton (1999) for further reading. | |

Feb 12, 17 | Constrained and non-smooth optimization: convex functions; interior point methods | Boyd and Vandenberghe, ch. 3-4 | |

Feb 19, 24 | No class | HW: Read and do some problems from Boyd and Vandenberghe | |

Feb 26, Mar 3 | Linear, quadratic, and semidefinite programs. LASSO methods | Efron et al (2004), Zou et al (2007), Friedman et al (2010), Bradley et al (2011), Tibshirani et al (2012) | |

Mar 5 | Convex duality, KKT conditions. Some advanced topics: proximal methods, dual decomposition, convex relaxation, and branch-and-bound | Boyd and Vandenberghe, ch. 5 | Background: Bach et al (2011), Boyd et al (2011), Luo et al (2010); branch-and-bound notes by Rahul Mazumder |

Mar 10 | Graphical models; dynamic programming; message passing; LP relaxations | Rabiner tutorial, Jordan (2004) | Background: Wainwright and Jordan (2008), MP and AMP notes by A. Maleki, LP relaxation notes by Y.-W. Teh |

Mar 12 | Short informal project presentations | ||

Mar 17, 19 | No class: spring break | ||

Mar 24 | Monte Carlo basics. Rejection sampling; Metropolis-Hastings; Gibbs sampling | Ch. 1-7 of Robert and Casella | Background: Devroye (1986). |

Mar 26 | Hamiltonian Monte Carlo | Neal (2010) |
Guest lecture by Ari Pakman. Further background: Hoffman and Gelman (2012), Pakman and Paninski (2013). Also, see this nice video. |

Mar 31 | Guest lecture by Arian Maleki on message passing and approximate message passing | notes | |

Apr 2, 7 | More on Monte Carlo. Adaptive rejection sampling; importance sampling; slice sampling. MCMC diagnostics. Rao-Blackwellization. Control variates. Adaptive simulated tempering. | Background: Park and Casella (2008), Neal (2003), Doucet (2010), Dellaportas and Kontoyiannis (2012), Ranganath et al (2014), Salakhutdinov (2010) | |

Apr 7, 9 | Sequential Monte Carlo | Doucet and Johansen (2011), Pitt and Shephard (1999) | Further reading collected by A. Doucet; Kantas et al (2014) |

Apr 14 | Stochastic gradient methods; Bayesian optimization | Book chapter by Lauren Hannah; Duchi et al (2011), Lacoste-Julien et al (2012), Snoek et al (2012) | |

Apr 16, 23 | Variational inference | Ormerod and Wand (2010) | Additional reading: Hoffman et al (2013), Emtiyaz Khan et al (2013), Ranganath et al (2014) |

Apr 21 | No class | ||

Apr 23 | More deterministic approximations for posteriors: expectation propagation, Gaussian quadrature | Sudderth (2002) | |

Apr 28 | Introduction to Dirichlet process methods | Neal (2000), Teh (2010) | See also the lecture notes on Peter Orbanz's page. |

Apr 30 | No class | ||

May 5 | Project presentations | Send me your report as a .pdf by May 10. |