Machine Learning Summer School 2002

Abstract

The course begins with an overview of the basic assumptions underlying Bayesian estimation. We introduce the notion of a prior distribution, which encodes how plausible we consider a certain estimate to be before any data are observed, and the concept of the posterior probability, which quantifies how plausible functions appear after we observe some data. Subsequently we show how inference is performed, and how some of the numerical problems that arise can be alleviated by various types of Maximum-a-Posteriori (MAP) estimation.
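
As a brief illustration of the quantities involved (the notation below is a generic sketch rather than the notation used in the course slides): for a hypothesis f and observed data D, Bayes' rule combines the prior p(f) with the likelihood p(D|f) to yield the posterior, and MAP estimation picks the mode of that posterior,

    p(f \mid D) = \frac{p(D \mid f)\, p(f)}{p(D)},
    \qquad
    f_{\mathrm{MAP}} = \operatorname*{argmax}_{f}\, p(D \mid f)\, p(f)
                     = \operatorname*{argmin}_{f}\, \bigl[ -\log p(D \mid f) - \log p(f) \bigr].

The second form shows why MAP estimation often reduces to regularized risk minimization: the negative log likelihood plays the role of a loss term and the negative log prior that of a regularizer.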

Once the basic tools are introduced, we analyze the specific properties of Bayesian estimators for three different types of priors: Gaussian Processes (including a description of the theory and of efficient means of implementation), which rely on the assumption that adjacent coefficients are correlated; Laplacian Processes, which assume that estimates can be expanded into a sparse linear combination of kernel functions and therefore favor such hypotheses; and Relevance Vector Machines, which assume that the contribution of each kernel function is governed by a normal distribution with its own variance.
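
To give a concrete feel for the Gaussian Process case, the following short NumPy sketch computes the GP regression posterior mean and variance on a toy one-dimensional problem. The Gaussian (RBF) covariance function, its length scale, the noise level, and the function names are illustrative choices for this sketch, not the specific settings or code used in the course.

    import numpy as np

    def rbf_kernel(X1, X2, length_scale=1.0):
        """Gaussian (RBF) covariance: k(x, x') = exp(-|x - x'|^2 / (2 l^2))."""
        d2 = (X1[:, None] - X2[None, :]) ** 2
        return np.exp(-0.5 * d2 / length_scale ** 2)

    def gp_posterior(X_train, y_train, X_test, noise=0.1, length_scale=1.0):
        """Posterior mean and variance of GP regression with Gaussian noise."""
        K = rbf_kernel(X_train, X_train, length_scale) + noise ** 2 * np.eye(len(X_train))
        K_star = rbf_kernel(X_test, X_train, length_scale)
        K_ss = rbf_kernel(X_test, X_test, length_scale)
        # Cholesky factorization of the (regularized) kernel matrix for stability.
        L = np.linalg.cholesky(K)
        alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
        mean = K_star @ alpha
        v = np.linalg.solve(L, K_star.T)
        var = np.diag(K_ss) - np.sum(v ** 2, axis=0)
        return mean, var

    # Toy example: noisy samples of a sine function.
    rng = np.random.default_rng(0)
    X_train = np.linspace(0, 5, 20)
    y_train = np.sin(X_train) + 0.1 * rng.standard_normal(20)
    X_test = np.linspace(0, 5, 100)
    mean, var = gp_posterior(X_train, y_train, X_test)

The cubic cost of the Cholesky factorization in the number of training points is precisely what motivates the low rank methods and the Bayes Committee Machine discussed in the implementation units below.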

Prerequisites

  • Elementary Linear Algebra
  • Calculus
  • Experience with Bayesian Methods is beneficial, but not required.
  • Experience with Kernel Methods is likewise beneficial, but not required.

Contents

  • Unit 1: Bayes Rule, Approximate Inference, Hyperparameters
  • Unit 2: Gaussian Processes, Covariance Function, Kernel
  • Unit 3: GP: Regression
  • Unit 4: GP: Classification
  • Unit 5: Implementation: Laplace Approximation, Low Rank Methods
  • Unit 6: Implementation: Low Rank Methods, Bayes Committee Machine
  • Unit 7: Relevance Vector Machine: Priors on Coefficients
  • Unit 8: Relevance Vector Machine: Efficient Optimization and Extensions
  • Lab