Foundations of Machine Learning 2018/19

African Masters in Machine Intelligence (AMMI) at AIMS Rwanda

Syllabus

Part 1: Mathematical Foundations

Part 2: Machine Learning

  • Graphical Models (slides, Chris Bishop's book chapter)
    • Directed graphical models
    • Undirected graphical models
    • D-separation
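The topics above can be illustrated with a small numerical check (a sketch only; the binary distributions below are made up, not course material): in the directed chain a → b → c, the joint factorizes as p(a)p(b|a)p(c|b), and d-separation predicts a ⟂ c | b, which we can verify directly.

```python
import numpy as np

# Directed chain a -> b -> c with made-up binary conditional tables.
p_a = np.array([0.6, 0.4])
p_b_given_a = np.array([[0.7, 0.3],    # p(b | a=0)
                        [0.2, 0.8]])   # p(b | a=1)
p_c_given_b = np.array([[0.9, 0.1],    # p(c | b=0)
                        [0.4, 0.6]])   # p(c | b=1)

# Joint from the factorization p(a, b, c) = p(a) p(b | a) p(c | b)
joint = p_a[:, None, None] * p_b_given_a[:, :, None] * p_c_given_b[None, :, :]

def cond_indep_given_b(joint):
    """Check the d-separation claim a is independent of c given b
    by comparing p(a, c | b) with p(a | b) p(c | b) for every b."""
    for b in range(joint.shape[1]):
        slab = joint[:, b, :]
        p_b = slab.sum()
        p_acb = slab / p_b                    # p(a, c | b)
        p_a_b = slab.sum(axis=1) / p_b        # p(a | b)
        p_c_b = slab.sum(axis=0) / p_b        # p(c | b)
        if not np.allclose(p_acb, np.outer(p_a_b, p_c_b)):
            return False
    return True
```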
  • Dimensionality Reduction with Principal Component Analysis (slides, MML book chapter)
    • Maximum variance perspective
    • Projection perspective
    • Key steps of PCA in practice
    • Probabilistic PCA
    • Other perspectives of PCA
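The key steps of PCA in practice (center the data, estimate the covariance, eigendecompose, project) can be sketched as follows; this is a minimal illustration on made-up data, not the course implementation:

```python
import numpy as np

def pca(X, num_components):
    """Minimal PCA sketch: center the data, eigendecompose the sample
    covariance, and project onto the leading eigenvectors."""
    X_centered = X - X.mean(axis=0)
    cov = np.cov(X_centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)           # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]                # sort by explained variance
    components = eigvecs[:, order[:num_components]]  # top-k eigenvectors
    return X_centered @ components

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))   # made-up data for illustration
Z = pca(X, 2)
```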
  • Linear Regression (slides, MML book chapter)
    • Maximum likelihood estimation
    • Maximum a posteriori estimation
    • Bayesian linear regression
    • Distribution over functions
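Maximum likelihood estimation for linear regression reduces to the normal equations, theta_ML = (Phi^T Phi)^(-1) Phi^T y. A small sketch with a made-up toy dataset (the feature matrix and variable names are ours):

```python
import numpy as np

def mle_weights(Phi, y):
    """Maximum likelihood weights via the normal equations:
    theta_ML = (Phi^T Phi)^{-1} Phi^T y."""
    return np.linalg.solve(Phi.T @ Phi, Phi.T @ y)

# toy data: y = 2x + 1 plus a little noise; features [1, x]
rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, size=50)
y = 2.0 * x + 1.0 + 0.01 * rng.normal(size=50)
Phi = np.stack([np.ones_like(x), x], axis=1)
theta = mle_weights(Phi, y)
```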
  • Model Selection (slides, MML book chapter)
    • Cross validation
    • Information criteria
    • Bayesian model selection
    • Occam's razor and the marginal likelihood
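Cross validation, the first item above, can be sketched in a few lines: hold out each fold in turn, fit on the rest, and average the held-out error. The polynomial example below uses made-up data and is an illustration only:

```python
import numpy as np

def kfold_mse(x, y, degree, k=5, seed=0):
    """Held-out mean squared error of a polynomial fit of the given
    degree, estimated by k-fold cross validation (a minimal sketch)."""
    idx = np.random.default_rng(seed).permutation(len(x))
    folds = np.array_split(idx, k)
    errors = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        coeffs = np.polyfit(x[train], y[train], degree)
        pred = np.polyval(coeffs, x[test])
        errors.append(np.mean((pred - y[test]) ** 2))
    return float(np.mean(errors))

rng = np.random.default_rng(2)
x = np.linspace(-1.0, 1.0, 60)
y = x**2 + 0.05 * rng.normal(size=60)   # made-up quadratic data
```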
  • Gaussian Process Regression (slides, GPML book)
    • Model
    • Inference with Gaussian processes
    • Training via evidence maximization
    • Model selection
    • Interpreting the hyper-parameters
    • Practical tips and tricks when working with Gaussian processes
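The GP predictive mean, k(X*, X)(K + sigma^2 I)^(-1) y with a squared-exponential covariance, can be sketched as below. The hyper-parameter values are made up for illustration, not obtained by the evidence maximization discussed above:

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=0.2):
    """Squared-exponential covariance k(a, b) = exp(-(a - b)^2 / (2 l^2))."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / lengthscale**2)

def gp_posterior_mean(x_train, y_train, x_test, noise_var=1e-4):
    """GP regression predictive mean: k(X*, X) (K + sigma^2 I)^{-1} y."""
    K = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    return rbf_kernel(x_test, x_train) @ np.linalg.solve(K, y_train)

x_train = np.linspace(0.0, 1.0, 20)
y_train = np.sin(2 * np.pi * x_train)   # made-up, noise-free data
mean = gp_posterior_mean(x_train, y_train, x_train)
```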
  • Bayesian Optimization (slides)
    • Optimization of meta-parameters in machine learning systems
    • Acquisition functions
    • Practicalities
    • Applications
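Among acquisition functions, expected improvement has a closed form under a Gaussian predictive distribution. A sketch (minimization convention; function and variable names are ours):

```python
import math
import numpy as np

def expected_improvement(mu, sigma, best):
    """Expected improvement for minimization under a Gaussian predictive
    distribution N(mu, sigma^2): EI = (best - mu) Phi(z) + sigma phi(z),
    with z = (best - mu) / sigma."""
    sigma = np.maximum(sigma, 1e-12)                 # guard against sigma = 0
    z = (best - mu) / sigma
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z**2) / math.sqrt(2.0 * math.pi)
    return (best - mu) * cdf + sigma * pdf
```

Points whose predicted mean is below the incumbent best (or whose predictive uncertainty is large) receive higher acquisition values.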
  • Sampling (slides)
    • Monte Carlo estimation
    • Importance sampling
    • Rejection sampling
    • Markov chain Monte Carlo
    • Metropolis-Hastings
    • Slice sampling
    • Gibbs sampling
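The Metropolis-Hastings algorithm listed above needs only an unnormalized target density. A minimal random-walk sketch targeting a standard normal (step size and names are ours):

```python
import numpy as np

def metropolis_hastings(log_p, x0, num_samples, step=1.0, seed=0):
    """Random-walk Metropolis-Hastings: propose x' ~ N(x, step^2) and
    accept with probability min(1, p(x') / p(x))."""
    rng = np.random.default_rng(seed)
    samples = np.empty(num_samples)
    x = x0
    for i in range(num_samples):
        proposal = x + step * rng.normal()
        if np.log(rng.uniform()) < log_p(proposal) - log_p(x):
            x = proposal
        samples[i] = x
    return samples

# target: standard normal, known only up to its normalization constant
samples = metropolis_hastings(lambda x: -0.5 * x**2, 0.0, 30000)
```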
  • Density Estimation with Gaussian Mixture Models (slides, MML book chapter)
    • Mixture models
    • Parameter estimation
    • Implementation
    • Latent variable perspective
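Parameter estimation in mixture models is typically done with EM. The sketch below is deliberately simplified (two 1-D components, fixed unit variances, equal mixing weights, made-up data), showing only the E-step/M-step alternation:

```python
import numpy as np

def em_gmm_1d(x, num_iters=50):
    """Simplified EM for a two-component 1-D Gaussian mixture with fixed
    unit variances and equal mixing weights (only the means are learned)."""
    mu = np.array([x.min(), x.max()])                       # crude initialization
    for _ in range(num_iters):
        log_lik = -0.5 * (x[:, None] - mu[None, :]) ** 2
        r = np.exp(log_lik - log_lik.max(axis=1, keepdims=True))
        r /= r.sum(axis=1, keepdims=True)                   # E-step: responsibilities
        mu = (r * x[:, None]).sum(axis=0) / r.sum(axis=0)   # M-step: weighted means
    return np.sort(mu)

rng = np.random.default_rng(3)
x = np.concatenate([rng.normal(-3.0, 1.0, 200), rng.normal(3.0, 1.0, 200)])
mu = em_gmm_1d(x)
```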
  • Classification with Logistic Regression (slides)
    • Logistic sigmoid as a posterior class probability
    • Implicit modeling assumptions
    • Maximum likelihood estimation
    • MAP estimation
    • Probabilistic model
    • Laplace approximation
    • Bayesian logistic regression
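Maximum likelihood estimation for logistic regression has no closed form; a common approach is gradient ascent on the log-likelihood. A sketch on made-up 1-D data (names and step size are ours):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def fit_logistic(Phi, y, lr=0.1, num_iters=2000):
    """Maximum likelihood logistic regression by gradient ascent;
    the log-likelihood gradient is Phi^T (y - sigmoid(Phi theta))."""
    theta = np.zeros(Phi.shape[1])
    for _ in range(num_iters):
        theta += lr * Phi.T @ (y - sigmoid(Phi @ theta)) / len(y)
    return theta

rng = np.random.default_rng(4)
x = rng.uniform(-2.0, 2.0, size=200)           # made-up 1-D inputs
y = (x + 0.1 * rng.normal(size=200) > 0).astype(float)
Phi = np.stack([np.ones_like(x), x], axis=1)   # bias + input feature
theta = fit_logistic(Phi, y)
accuracy = np.mean((sigmoid(Phi @ theta) > 0.5) == (y == 1.0))
```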
  • Information Theory (slides by Pedro Mediano)
    • Entropy
    • KL divergence
    • Mutual information
    • Coding theory
    • Information theory and statistical inference
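Entropy and the KL divergence, the first two quantities above, are one-liners for discrete distributions; a sketch (in bits, i.e. base-2 logarithms):

```python
import numpy as np

def entropy(p):
    """Shannon entropy H(p) = -sum_i p_i log2 p_i, in bits."""
    p = p[p > 0]               # 0 log 0 is taken to be 0
    return float(-np.sum(p * np.log2(p)))

def kl_divergence(p, q):
    """KL(p || q) = sum_i p_i log2(p_i / q_i); assumes q_i > 0
    wherever p_i > 0."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log2(p[mask] / q[mask])))
```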
  • Variational Inference (slides)
    • Inference as optimization
    • Evidence lower bound
    • Conditionally conjugate models
    • Mean-field variational inference in conditionally conjugate models
    • Stochastic variational inference
    • Black-box variational inference for hierarchical Bayesian models
    • Gradient estimators
    • Amortized inference
    • Richer posteriors

Team

  • Marc Deisenroth (Lecturer)
  • Kossi Amouzouvi (Tutor, AIMS Rwanda)
  • Oluwafemi Azeez (Tutor, CMU Africa)
  • Steindór Sæmundsson (Tutor, Imperial College London)
  • Pedro Martinez Mediano (Tutor, Imperial College London)