Foundations of Machine Learning 2018/19
African Masters in Machine Intelligence (AMMI) at AIMS Rwanda
This course runs as part of the African Masters in Machine Intelligence (AMMI) at the African Institute for Mathematical Sciences (AIMS), Rwanda.
Syllabus
Part 1: Mathematical Foundations
- Linear Algebra (MML book chapter)
- Groups
- Vector spaces
- Linear independence
- Basis
- Coordinate representation
- Basis change
- Linear mappings
- Eigenvalues
- Analytic Geometry (MML book chapter)
- Norms and inner products
- Distances and angles
- Orthogonal projections
- Vector Calculus (slides, MML book chapter)
- Scalar differentiation
- Partial derivatives
- Jacobian
- Chain rule
- Derivatives of matrices w.r.t. matrices
- Gradients in a multi-layer neural network
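As a small taste of the chain-rule and backpropagation material, here is a minimal sketch (an assumed example, not taken from the slides) that differentiates a tiny two-layer network and checks the result with finite differences:

```python
import numpy as np

# Hypothetical two-layer network f(x) = W2 @ tanh(W1 @ x); we apply the
# chain rule layer by layer (backpropagation) to get dy/dW1.
rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(3, 2)), rng.normal(size=(1, 3))
x = rng.normal(size=2)

# Forward pass, keeping intermediate values for the backward pass.
a = W1 @ x          # pre-activations
h = np.tanh(a)      # hidden layer
y = W2 @ h          # scalar output, shape (1,)

# Backward pass: dy/dW1 via the chain rule dy/dh * dh/da * da/dW1.
dy_dh = W2[0]                  # shape (3,)
dy_da = dy_dh * (1 - h**2)     # tanh'(a) = 1 - tanh(a)^2, elementwise
dy_dW1 = np.outer(dy_da, x)    # shape (3, 2)

# Finite-difference check of one entry of the gradient.
eps = 1e-6
W1p = W1.copy()
W1p[0, 0] += eps
num_grad = ((W2 @ np.tanh(W1p @ x)) - y)[0] / eps
assert abs(num_grad - dy_dW1[0, 0]) < 1e-4
```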
- Statistics and Probability Theory (MML book chapter)
- Optimization (MML book chapter)
- Gradient descent
- Stochastic gradient descent
- Momentum
- Constrained optimization
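A minimal sketch of gradient descent with momentum (an assumed toy objective, not from the course materials), minimizing f(x, y) = x² + 10y²:

```python
# Gradient descent with momentum on f(x, y) = x^2 + 10*y^2,
# whose minimum is at the origin.
def grad(p):
    x, y = p
    return [2 * x, 20 * y]

p = [5.0, 2.0]          # starting point
v = [0.0, 0.0]          # velocity (momentum buffer)
lr, beta = 0.05, 0.9    # step size and momentum coefficient

for _ in range(200):
    g = grad(p)
    v = [beta * vi - lr * gi for vi, gi in zip(v, g)]  # update velocity
    p = [pi + vi for pi, vi in zip(p, v)]              # take the step

# The iterates converge towards the minimum at (0, 0).
```

Setting `beta = 0` recovers plain gradient descent; replacing the full gradient with one computed on a random mini-batch gives stochastic gradient descent.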
Part 2: Machine Learning
- Graphical Models (slides, Chris Bishop's book chapter)
- Directed graphical models
- Undirected graphical models
- D-separation
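A small sketch of these ideas (an assumed example): for the directed chain a → b → c, the joint factorizes as p(a)p(b|a)p(c|b), and d-separation predicts a ⊥ c | b, which we can verify numerically for binary variables:

```python
import itertools

# Directed graphical model a -> b -> c over binary variables.
# Conditional probability tables (hypothetical numbers).
p_a = {0: 0.6, 1: 0.4}
p_b_given_a = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}  # p_b_given_a[a][b]
p_c_given_b = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.4, 1: 0.6}}  # p_c_given_b[b][c]

def joint(a, b, c):
    # The joint factorises according to the graph: p(a) p(b|a) p(c|b).
    return p_a[a] * p_b_given_a[a][b] * p_c_given_b[b][c]

def p_c_given_ab(a, b, c):
    norm = sum(joint(a, b, cc) for cc in (0, 1))
    return joint(a, b, c) / norm

# d-separation: b blocks the only path from a to c, so p(c | a, b)
# must not depend on a.
for b, c in itertools.product((0, 1), repeat=2):
    assert abs(p_c_given_ab(0, b, c) - p_c_given_ab(1, b, c)) < 1e-12
```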
- Dimensionality Reduction with Principal Component Analysis (slides, MML book chapter)
- Maximum variance perspective
- Projection perspective
- Key steps of PCA in practice
- Probabilistic PCA
- Other perspectives of PCA
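The key steps of PCA in practice can be sketched as follows (an assumed NumPy example, not taken from the book chapter):

```python
import numpy as np

# Key PCA steps: center the data, eigendecompose the covariance,
# project onto the leading eigenvectors, reconstruct.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3)) @ np.diag([3.0, 1.0, 0.1])  # 100 points in R^3

Xc = X - X.mean(axis=0)                 # 1. center the data
C = Xc.T @ Xc / (len(X) - 1)            # 2. sample covariance matrix
evals, evecs = np.linalg.eigh(C)        # 3. eigendecomposition (ascending)
order = np.argsort(evals)[::-1]         # sort eigenvalues descending
B = evecs[:, order[:2]]                 # 4. top-2 principal directions
Z = Xc @ B                              # 5. low-dimensional codes
X_rec = Z @ B.T + X.mean(axis=0)        # 6. reconstruction

# Most of the variance lies in the first two directions, so the
# reconstruction error is small relative to the data scale.
```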
- Linear Regression (slides, MML book chapter)
- Maximum likelihood estimation
- Maximum a posteriori estimation
- Bayesian linear regression
- Distribution over functions
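For a quick illustration of maximum likelihood estimation in this setting (an assumed example): with Gaussian noise, the MLE is the least-squares solution θ* = (Φ^T Φ)^{-1} Φ^T y.

```python
import numpy as np

# Maximum likelihood for linear regression with Gaussian noise:
# the closed-form least-squares solution.
rng = np.random.default_rng(2)
x = rng.uniform(-1, 1, size=50)
y = 2.0 * x - 0.5 + 0.1 * rng.normal(size=50)   # true parameters (-0.5, 2.0)

Phi = np.column_stack([np.ones_like(x), x])      # design matrix [1, x]
theta = np.linalg.solve(Phi.T @ Phi, Phi.T @ y)  # MLE = least squares
# theta recovers the intercept and slope up to noise.
```

Adding a Gaussian prior on θ turns this into MAP estimation (ridge regression), which only changes the solve to `Phi.T @ Phi + lam * np.eye(2)` for a regularizer λ.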
- Model Selection (slides, MML book chapter)
- Cross validation
- Information criteria
- Bayesian model selection
- Occam's razor and the marginal likelihood
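Cross validation can be sketched in a few lines (a hypothetical example, not from the slides): compare polynomial degrees by their average held-out error.

```python
import numpy as np

# K-fold cross validation for model selection among polynomial degrees.
rng = np.random.default_rng(5)
x = rng.uniform(-1, 1, 100)
y = x - x**3 + 0.05 * rng.normal(size=100)   # cubic ground truth + noise

def cv_error(degree, K=5):
    folds = np.array_split(np.arange(len(x)), K)
    errs = []
    for k in range(K):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(K) if j != k])
        coeffs = np.polyfit(x[train], y[train], degree)   # fit on K-1 folds
        pred = np.polyval(coeffs, x[test])                # evaluate held out
        errs.append(np.mean((pred - y[test]) ** 2))
    return np.mean(errs)

errors = {d: cv_error(d) for d in [1, 3, 9]}
# Degree 1 underfits; degree 3 matches the ground truth and achieves
# a lower cross-validation error.
```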
- Gaussian Process Regression (slides, GPML book)
- Model
- Inference with Gaussian processes
- Training via evidence maximization
- Model selection
- Interpreting the hyper-parameters
- Practical tips and tricks when working with Gaussian processes
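A minimal Gaussian process regression sketch (an assumed example with an RBF kernel; the posterior mean and covariance follow the standard GP prediction equations):

```python
import numpy as np

def rbf(A, B, ell=0.5, sf=1.0):
    # Squared-exponential kernel with lengthscale ell and signal std sf.
    d = A[:, None] - B[None, :]
    return sf**2 * np.exp(-0.5 * (d / ell) ** 2)

X = np.array([-1.0, 0.0, 1.0])       # training inputs
y = np.sin(3 * X)                    # training targets
sn = 1e-3                            # noise std (hyper-parameter)

K = rbf(X, X) + sn**2 * np.eye(len(X))   # kernel matrix plus noise
Xs = np.array([0.0, 0.5])                # test inputs
Ks = rbf(Xs, X)

alpha = np.linalg.solve(K, y)
mean = Ks @ alpha                                   # posterior mean
cov = rbf(Xs, Xs) - Ks @ np.linalg.solve(K, Ks.T)   # posterior covariance

# At a training input the posterior mean reproduces the observation
# almost exactly and the predictive variance is close to zero.
```

Training via evidence maximization then amounts to optimizing the log marginal likelihood with respect to the hyper-parameters (`ell`, `sf`, `sn`).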
- Bayesian Optimization (slides)
- Optimization of meta-parameters in machine learning systems
- Acquisition functions
- Practicalities
- Applications
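A toy Bayesian optimization loop (an assumed sketch: a GP surrogate with an upper-confidence-bound acquisition function over a 1-D grid of candidate meta-parameter values):

```python
import numpy as np

def rbf(A, B, ell=0.2):
    # Squared-exponential kernel, unit amplitude.
    return np.exp(-0.5 * ((A[:, None] - B[None, :]) / ell) ** 2)

def f(x):
    return -(x - 0.3) ** 2            # objective, maximum at x = 0.3

grid = np.linspace(0.0, 1.0, 101)     # candidate evaluation points
X = np.array([0.0, 1.0])              # initial evaluations
y = f(X)

for _ in range(10):
    K = rbf(X, X) + 1e-6 * np.eye(len(X))
    Ks = rbf(grid, X)
    mean = Ks @ np.linalg.solve(K, y)                   # GP posterior mean
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks.T).T, axis=1)
    ucb = mean + 2.0 * np.sqrt(np.clip(var, 0.0, None)) # acquisition
    x_next = grid[np.argmax(ucb)]     # evaluate where UCB is largest
    X = np.append(X, x_next)
    y = np.append(y, f(x_next))

x_best = X[np.argmax(y)]
# x_best ends up near the true maximiser 0.3 after a handful of
# function evaluations.
```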
- Sampling (slides)
- Monte Carlo estimation
- Importance sampling
- Rejection sampling
- Markov chain Monte Carlo
- Metropolis Hastings
- Slice sampling
- Gibbs sampling
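A sketch of Metropolis-Hastings (an assumed example): sample from an unnormalized standard normal density with a Gaussian random-walk proposal.

```python
import math
import random

random.seed(0)

def log_p(x):
    # Unnormalised log target: standard normal up to a constant.
    return -0.5 * x * x

x, samples = 0.0, []
for _ in range(20000):
    x_new = x + random.gauss(0.0, 1.0)          # propose a move
    log_accept = log_p(x_new) - log_p(x)        # symmetric proposal
    if math.log(random.random()) < log_accept:  # accept or reject
        x = x_new
    samples.append(x)

burned = samples[2000:]                         # discard burn-in
mean = sum(burned) / len(burned)
var = sum((s - mean) ** 2 for s in burned) / len(burned)
# mean ≈ 0 and var ≈ 1, matching the standard normal target.
```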
- Density Estimation with Gaussian Mixture Models (slides, MML book chapter)
- Mixture models
- Parameter estimation
- Implementation
- Latent variable perspective
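The EM-style parameter estimation for a mixture model can be sketched as follows (an assumed 1-D, two-component example): alternate between computing responsibilities (E-step) and closed-form weighted maximum likelihood updates (M-step).

```python
import numpy as np

# EM for a two-component 1-D Gaussian mixture.
rng = np.random.default_rng(3)
X = np.concatenate([rng.normal(-2, 0.5, 200), rng.normal(2, 0.5, 200)])

weights = np.array([0.5, 0.5])     # mixture weights
mu = np.array([-1.0, 1.0])         # initial means
var = np.array([1.0, 1.0])         # initial variances

for _ in range(50):
    # E-step: responsibilities r[n, k] proportional to
    # weights[k] * N(x_n | mu_k, var_k), computed in log space.
    d = X[:, None] - mu[None, :]
    logp = -0.5 * d**2 / var - 0.5 * np.log(2 * np.pi * var) + np.log(weights)
    r = np.exp(logp - logp.max(axis=1, keepdims=True))
    r /= r.sum(axis=1, keepdims=True)

    # M-step: weighted maximum likelihood updates.
    Nk = r.sum(axis=0)
    mu = (r * X[:, None]).sum(axis=0) / Nk
    var = (r * (X[:, None] - mu) ** 2).sum(axis=0) / Nk
    weights = Nk / len(X)

# The estimated means converge near the true component centres -2 and 2.
```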
- Classification with Logistic Regression (slides)
- The logistic sigmoid as a posterior class probability
- Implicit modeling assumptions
- Maximum likelihood estimation
- MAP estimation
- Probabilistic model
- Laplace approximation
- Bayesian logistic regression
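Maximum likelihood for logistic regression has no closed form, but a short gradient-descent sketch (an assumed example) shows the idea:

```python
import numpy as np

# Logistic regression fitted by gradient descent on the average
# negative log-likelihood.
rng = np.random.default_rng(4)
X = rng.normal(size=(200, 2))
w_true = np.array([1.5, -2.0])
y = (X @ w_true + 0.5 * rng.normal(size=200) > 0).astype(float)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

w = np.zeros(2)
for _ in range(500):
    p = sigmoid(X @ w)              # predicted class-1 probabilities
    grad = X.T @ (p - y) / len(y)   # gradient of the average NLL
    w -= 1.0 * grad                 # gradient step

acc = np.mean((sigmoid(X @ w) > 0.5) == (y == 1))
# The fitted weights point in the direction of w_true and the
# training accuracy is high.
```

Adding a Gaussian prior on `w` turns the objective into MAP estimation; the Laplace approximation then fits a Gaussian around that mode to approximate the posterior.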
- Information Theory (slides by Pedro Mediano)
- Entropy
- KL divergence
- Mutual information
- Coding theory
- Information theory and statistical inference
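Entropy and KL divergence for discrete distributions are one-liners (an assumed example, in bits):

```python
import math

def entropy(p):
    # Shannon entropy H(p) in bits; terms with p_i = 0 contribute 0.
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def kl(p, q):
    # KL divergence D(p || q) in bits; non-negative, 0 iff p == q.
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.5]
q = [0.9, 0.1]
H = entropy(p)   # a fair coin carries exactly 1 bit
D = kl(p, q)     # strictly positive since p != q
```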
- Variational Inference (slides)
- Inference as optimization
- Evidence lower bound
- Conditionally conjugate models
- Mean-field variational inference in conditionally conjugate models
- Stochastic variational inference
- Black-box variational inference for hierarchical Bayesian models
- Gradient estimators
- Amortized inference
- Richer posteriors
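Inference as optimization can be illustrated with a toy conjugate model (an assumed example): for x ~ N(θ, 1) with prior θ ~ N(0, 1), fit q(θ) = N(m, s²) by maximizing the evidence lower bound (ELBO), which here is available in closed form up to constants, and recover the exact posterior N(x/2, 1/2).

```python
import math

x = 1.0   # single observation

def elbo(m, s2):
    # ELBO(m, s2) = E_q[log p(x|theta)] + E_q[log p(theta)] + H[q],
    # each term written up to additive constants.
    expected_loglik = -0.5 * ((x - m) ** 2 + s2)
    expected_logprior = -0.5 * (m**2 + s2)
    entropy = 0.5 * math.log(s2)
    return expected_loglik + expected_logprior + entropy

# Simple grid search over the variational parameters (m, s2).
best = max(
    ((elbo(m / 100, s2 / 100), m / 100, s2 / 100)
     for m in range(-300, 301) for s2 in range(1, 201)),
    key=lambda t: t[0],
)
_, m_star, s2_star = best
# The ELBO maximum coincides with the exact posterior mean 0.5
# and variance 0.5.
```

In non-conjugate models this closed form is unavailable, which is where stochastic, black-box, and amortized variational inference with Monte Carlo gradient estimators come in.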
References
- M. P. Deisenroth, A. A. Faisal, and C. S. Ong, Mathematics for Machine Learning (the "MML book"), Cambridge University Press.
- C. E. Rasmussen and C. K. I. Williams, Gaussian Processes for Machine Learning (the "GPML book"), MIT Press, 2006.
- C. M. Bishop, Pattern Recognition and Machine Learning, Springer, 2006.
Team
- Marc Deisenroth (Lecturer)
- Kossi Amouzouvi (Tutor, AIMS Rwanda)
- Oluwafemi Azeez (Tutor, CMU Africa)
- Steindór Sæmundsson (Tutor, Imperial College London)
- Pedro Martinez Mediano (Tutor, Imperial College London)