Date | Topic | Reading | Notes |
---|---|---|---|
Jan 22 | Introduction; linesearch | Ch. 1-2 of Givens+Hoeting | See Sun and Yuan (2006) for further details on convergence analysis. Berland's notes on automatic differentiation. Mahsereci and Hennig (2016) on Bayesian linesearch. HW: derive the convergence rate of the secant method. |
Jan 29 | Choosing search directions: Newton, generalized linear models, inexact Newton, quasi-Newton, Fisher scoring, BFGS. Exploiting special structure to solve Newton linear equations more efficiently: banded, sparse, low-rank, block-structured (etc.) matrices | See Vandenberghe's notes for some further background | |
Feb 5 + 12 | Conjugate gradients. Preconditioning. Toeplitz, circulant, and Kronecker matrices. Application: Gaussian processes | Shewchuk (1994); see Chan and Ng (1996) on PCG for Toeplitz systems. Gardner et al '19 for fast GP inference. | See Rasmussen and Williams (2006) for more background on GP regression. Also notes by John Cunningham. Rahimi+Recht '07 and Fastfood on random features; Drineas + Mahoney '16 on randomized linear algebra. HW: code up a GP regression. |
Feb 19, 26 | Constrained and non-smooth optimization: convex functions; interior point methods. Convex duality and KKT conditions. Linear, quadratic, and semidefinite programs. | Boyd and Vandenberghe, ch. 3-5 | |
Mar 5 | LASSO methods. Some advanced topics: proximal methods, dual decomposition, convex relaxation | Efron et al (2004), Zou et al (2007), Friedman et al (2010), Bradley et al (2011), Tibshirani et al (2012) | More reading: Bach et al (2011), Boyd et al (2011) |
Mar 12, 26 | Expectation maximization and variational inference. Stochastic gradient descent | Dempster et al (1977), Neal and Hinton (1999), Blei et al (2016), Bottou et al (2018) | |
Mar 19 | No class | Spring break | |
Mar 26 | 2-minute project idea presentations | ||
Apr 2, 9 | Graphical models; dynamic programming; message passing | Rabiner tutorial, Wainwright lecture notes | Background: Wainwright and Jordan (2008), MP and AMP notes by A. Maleki |
Apr 16, 30 | Monte Carlo. Rejection sampling; importance sampling; control variates. Metropolis-Hastings; Gibbs sampling. Hamiltonian Monte Carlo; Bouncy particle sampler. MCMC diagnostics. Rao-Blackwellization. Adaptive simulated tempering. | Ch. 1-7 of Robert and Casella; Neal (2010) | Background: Devroye (1986); Hoffman and Gelman (2012), Pakman and Paninski (2013), Bouchard-Cote et al (2015+), Park and Casella (2008), Neal (2003), Doucet (2010), Carlson et al (2016). Also, see this nice video. |
Apr 16 | Sequential Monte Carlo | Doucet and Johansen (2011), Pitt and Shephard (1999) | Further reading collected by A. Doucet; Kantas et al (2014) |
Apr 23, May 7 | No class | Study days | |
May 14 | Project presentations | Send me your report as a .pdf. |