Date | Topic | Reading | Notes |
---|---|---|---|
Jan 16 | Introduction; linesearch | Ch. 1-2 of Givens+Hoeting | See Sun and Yuan (2006) for further details on convergence analysis. Berland's notes on automatic differentiation. Mahsereci and Hennig (2016) on Bayesian linesearch. HW: derive the convergence rate of the secant method. |
Jan 23 | Choosing search directions: Newton, generalized linear models, inexact Newton, quasi-Newton, Fisher scoring, BFGS. Exploiting special structure to solve Newton linear equations more efficiently: banded, sparse, low-rank, block-structured (etc.) matrices | See Vandenberghe's notes for some further background | |
Jan 30, Feb 6 | Conjugate gradients. Preconditioning. Toeplitz, circulant, and Kronecker matrices. Application: Gaussian processes | Shewchuk (1994); see Chan and Ng (1996) on PCG for Toeplitz systems. | See Rasmussen and Williams (2006) for more background on GP regression. Also notes by John Cunningham. Rahimi+Recht '07 and Fastfood on random features; Drineas + Mahoney '16 on randomized linear algebra. HW: code up a GP regression. |
Feb 13, 27, Mar 6 | Constrained and non-smooth optimization: convex functions; interior point methods. Convex duality and KKT conditions. Linear, quadratic, and semidefinite programs. | Boyd and Vandenberghe, ch. 3-5 | |
Feb 20 | Expectation maximization and variational inference | Dempster et al (1977), Neal and Hinton (1999), Blei et al (2016) | Guest lecture by Gonzalo Mena |
Mar 13, 20 | No class | Spring break | |
Mar 27 | LASSO methods. Some advanced topics: proximal methods, dual decomposition, convex relaxation | Efron et al (2004), Zou et al (2007), Friedman et al (2010), Bradley et al (2011), Tibshirani et al (2012) | More reading: Bach et al (2011), Boyd et al (2011) |
Apr 3 | 2-minute project idea presentations | ||
Apr 3, 17 | Graphical models; dynamic programming; message passing | Rabiner tutorial, Wainwright lecture notes | Background: Wainwright and Jordan (2008), MP and AMP notes by A. Maleki |
Apr 10 | No class | ||
Apr 24 | Monte Carlo. Rejection sampling; importance sampling; control variates. Metropolis-Hastings; Gibbs sampling. Hamiltonian Monte Carlo; Bouncy particle sampler. MCMC diagnostics. Rao-Blackwellization. Adaptive simulated tempering. | Ch. 1-7 of Robert and Casella; Neal (2010) | Background: Devroye (1986); Hoffman and Gelman (2012), Pakman and Paninski (2013), Bouchard-Cote et al (2015+), Park and Casella (2008), Neal (2003), Doucet (2010), Dellaportas and Kontoyiannis (2012), Carlson et al (2016), D. Wilkinson's blog on "exact approximate" MCMC. Also, see this nice video. |
May 1 | No class | Study days | |
May 8 | Project presentations | Send me your report as a .pdf. |