### ENGN 4520: Introduction to Machine Learning

#### Overview

Machine learning is an exciting subject dealing with the automatic recognition of patterns (e.g. recognition of faces, postal codes on envelopes, or speech), prediction (e.g. predicting the stock market), data mining (e.g. finding good customers or fraudulent transactions), applications in bioinformatics (finding relevant genes, annotating the genome, processing DNA microarray images), and internet-related problems (spam filtering, searching, sorting, network security). It is becoming a key area for technical advance, and people skilled in it are in high demand worldwide.

This unit introduces the fundamentals of machine learning, based on linear and kernel classifiers. The course requires mathematical and computer skills. It will cover linear algebra and numerical mathematics techniques, the Perceptron, Online Learning, Regression methods, Kernels and Regularization, Large Margin Methods such as Support Vector Machines, and applications such as Text Categorization or Database Cleaning.

#### Prerequisites

#### Details

**Format:** Second half of the first semester, from April 30, 2001 until June 8, 2001. The course consists of **three lectures** plus **two hours of tutorial and lab** per week. Exercises and programming problems are handed out on a weekly basis.

**Marks:** This is a 3 credit point unit. The final exam counts for 50%,
programming for 25%, and the exercises for 25%. You are encouraged to solve the
exercises and programming problems in teams of three (no copying between
different teams though).

**Lecture times:** Lectures Monday 2-4pm and Tuesday 1-2pm in GEOL T;
tutorials Thursday 2-3pm in ENGN T and 3-5pm in ENGN G1.

#### Contents

**Week 1:** Linear Algebra, Hilbert Spaces, Numerical Mathematics (Lecture 1, Lecture 2, Lecture 3, Problem Sheet)

**Week 2:** Problems in Learning Theory, Statistics and Probability, Risk Functional, Common Distributions, Perceptron (Lecture 4, Lecture 5, Lecture 6, Problem Sheet)
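The Perceptron covered this week can be summarized in a few lines: keep a weight vector and bias, and whenever a training point is misclassified, move the hyperplane toward it. The sketch below is illustrative only (function and variable names are our own, not the course's code) and assumes labels in {-1, +1}:

```python
import numpy as np

def perceptron(X, y, epochs=10):
    """Train (w, b) with the classic mistake-driven update."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            # A mistake is a non-positive margin; update toward xi.
            if yi * (w @ xi + b) <= 0:
                w += yi * xi
                b += yi
    return w, b

# Tiny linearly separable example: label is the sign of the first coordinate.
X = np.array([[2.0, 1.0], [1.5, -0.5], [-2.0, 0.5], [-1.0, -1.0]])
y = np.array([1, 1, -1, -1])
w, b = perceptron(X, y)
preds = np.sign(X @ w + b)
```

On linearly separable data like this, the Novikoff theorem guarantees the loop stops making mistakes after finitely many updates.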

**Week 3:** Regression, Squared Loss, Noise Models and Loss, Regularization, Bayesian Inference (Lecture 7, Lecture 8, Lecture 9, Problem Sheet)
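Regularized least-squares regression, one of this week's topics, has a convenient closed form: minimizing the squared loss plus a ridge penalty gives w = (XᵀX + λI)⁻¹Xᵀy. A minimal sketch (our own illustrative code, not the course's):

```python
import numpy as np

def ridge(X, y, lam=0.1):
    """Closed-form regularized least squares:
    w = (X^T X + lam * I)^{-1} X^T y, solved without an explicit inverse."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Recover known weights from noisy synthetic data.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
w_true = np.array([1.0, -2.0])
y = X @ w_true + 0.01 * rng.normal(size=50)
w = ridge(X, y, lam=1e-3)
```

The penalty λ keeps the system well conditioned even when XᵀX is nearly singular, which connects to the Bayesian view (a Gaussian prior on w) also covered this week.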

**Week 4:** Kernels, Kernel Perceptron, Kernel Regression (Lecture 10, Lecture 11, Lecture 12, Problem Sheet), SVLab Lite source
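Kernel regression replaces inner products with kernel evaluations: fit coefficients α = (K + λI)⁻¹y on the training kernel matrix, then predict with k(x_new, x_train)·α. A sketch using a Gaussian RBF kernel (illustrative assumptions: γ and λ values, function names are our own):

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian RBF kernel matrix: k(a, b) = exp(-gamma * ||a - b||^2)."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def kernel_ridge_fit(X, y, lam=1e-3, gamma=1.0):
    """Solve (K + lam * I) alpha = y for the dual coefficients."""
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

def kernel_ridge_predict(X_train, alpha, X_new, gamma=1.0):
    return rbf_kernel(X_new, X_train, gamma) @ alpha

# Fit a nonlinear target that no linear model could capture.
X = np.linspace(-3, 3, 40)[:, None]
y = np.sin(X[:, 0])
alpha = kernel_ridge_fit(X, y)
y_hat = kernel_ridge_predict(X, alpha, X)
```

The same kernel trick turns the Week 2 Perceptron into the kernel Perceptron: store mistake counts per training point instead of an explicit weight vector.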

**Week 5:** Large Margin and Optimization Methods, SV Classification, SV Regression, Novelty Detection (Lecture 13, Lecture 14, Lecture 15, Problem Sheet)
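The large-margin objective for SV classification is usually solved as a quadratic program, as in the lectures; as a simple stand-in, the same hinge-loss objective λ/2·||w||² + mean(max(0, 1 − y(w·x + b))) can be minimized by stochastic subgradient descent. This sketch is our own illustration under that substitution, not the course's solver:

```python
import numpy as np

def linear_svm_sgd(X, y, lam=0.01, epochs=200, lr0=0.1):
    """Stochastic subgradient descent on the hinge-loss SVM objective."""
    rng = np.random.default_rng(0)
    n, d = X.shape
    w, b, t = np.zeros(d), 0.0, 0
    for _ in range(epochs):
        for i in rng.permutation(n):
            t += 1
            lr = lr0 / (1 + lr0 * lam * t)  # decaying step size
            margin = y[i] * (w @ X[i] + b)
            w *= 1 - lr * lam               # shrink from the regularizer
            if margin < 1:                  # point inside the margin: push out
                w += lr * y[i] * X[i]
                b += lr * y[i]
    return w, b

# Separable toy problem.
X = np.array([[2.0, 2.0], [1.0, 2.5], [-2.0, -1.0], [-1.5, -2.0]])
y = np.array([1, 1, -1, -1])
w, b = linear_svm_sgd(X, y)
preds = np.sign(X @ w + b)
```

Unlike the Perceptron, the regularizer and the margin condition (margin < 1 rather than ≤ 0) keep pushing points to a distance of at least 1/||w|| from the hyperplane.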

**Week 6:** Applications and Mini Projects: Text Categorization (on Reuters data), Bad Digits (on USPS)

**Handouts:** 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15

**Exam:** Problems, or Problems with Solutions, are online now.