
ICANN 2001, Vienna, Austria, August 21, 2001. Tutorial Slides, Talk Slides

Alex Smola, RSISE, Machine Learning Group, Australian National University, Canberra

Abstract

Support Vector Machines and related Bayesian kernel methods, such as Gaussian Processes or Relevance Vector Machines, have been deployed successfully in classification and regression tasks. They work by mapping the data into a high-dimensional feature space and computing linear functions on the features. This makes them easily accessible to optimization and theoretical analysis. The algorithmic advantage is that the optimization problems arising from Support Vector Machines have a global minimum and can be solved with standard quadratic programming tools. Furthermore, the parametrization of kernel methods tends to be rather intuitive for the user.
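
The feature-space idea can be illustrated with a minimal sketch. It assumes a homogeneous polynomial kernel of degree 2 on two-dimensional inputs, an illustrative choice that is not taken from the tutorial slides. The helper names `phi`, `dot` and `poly2_kernel` are hypothetical. The sketch checks that evaluating the kernel on the raw inputs gives the same value as an inner product between explicitly mapped features, so a linear function in feature space can be computed without ever constructing the features.

```python
import math

def dot(a, b):
    """Plain inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))

def phi(x):
    """Explicit degree-2 feature map for a 2-D input (illustrative assumption)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)

def poly2_kernel(x, y):
    """Homogeneous polynomial kernel of degree 2: k(x, y) = <x, y>^2."""
    return dot(x, y) ** 2

# Two arbitrary example points (made up for this sketch).
x = (1.0, 2.0)
y = (3.0, -1.0)

# Inner product computed explicitly in the 3-D feature space.
feature_space_value = dot(phi(x), phi(y))

# Same quantity computed directly from the 2-D inputs via the kernel.
kernel_value = poly2_kernel(x, y)

print(math.isclose(feature_space_value, kernel_value))
```

For this kernel the explicit feature space is only three-dimensional, so the saving is negligible. The point of the sketch is the identity itself, which is what lets kernel methods work with much larger feature spaces at the cost of one kernel evaluation per pair of points.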

In this tutorial, I will introduce the basic theory of Support Vector Machines and some recent extensions. Moreover, I will present a few simple algorithms to solve the optimization problems in practice.

Outline

Linear Estimators

Discriminant Analysis
Support Vector Classification
Least Mean Squares Regression
Support Vector Regression
Novelty Detection

Kernels

Feature Extraction
Feature Spaces and Kernels
Examples of General-Purpose Kernels
Special Purpose Kernels (Discriminative Models, Texts, Trees, Images)
Kernels and Regularization
Test Criteria for Kernels

Optimization

Newton’s Method
Quadratic Optimizers
Chunking and SMO
Online Methods

Bayesian Methods

Bayesian Basics
A Gaussian Process View
Likelihood, Posterior Probabilities and the MAP approximation
Hyperparameters
Algorithms