Introduction
Syllabus
Administrative details
Programming with data
- Overview of machine learning problems
- Supervised problems (regression, classification, sequence annotation)
- Unsupervised problems (clustering, topics, subspaces)
Problem settings
- Nonresponsive environment (induction, transduction, covariate shift)
- Responsive environment (batch, online, active learning, bandits, reinforcement learning)
- Discriminative vs. generative models
Data
- Internet (traffic, user generated content, activity)
- Medicine (hospitals, healthcare, sequencing)
- Physics
Basic tools
- Nearest neighbors
- Linear regression
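
As a rough illustration of the two basic tools listed above, here is a minimal sketch in plain Python. It assumes Euclidean distance and a single nearest neighbor for the first tool, and a scalar input with ordinary least squares for the second. The function names (`nearest_neighbor_predict`, `fit_line`) and the toy data are hypothetical and are not taken from the course material.

```python
import math


def nearest_neighbor_predict(train_points, train_labels, query):
    """Return the label of the training point closest to `query`.

    Assumption: 1-nearest-neighbor with Euclidean distance.
    """
    best_index = min(
        range(len(train_points)),
        key=lambda i: math.dist(train_points[i], query),
    )
    return train_labels[best_index]


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for scalar x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of x-y cross deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy data, invented for illustration only.
points = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (6.0, 5.0)]
labels = ["a", "a", "b", "b"]
label = nearest_neighbor_predict(points, labels, (4.5, 4.0))

# Points lying exactly on y = 2x + 1.
slope, intercept = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(label, slope, intercept)
```

Nearest neighbors needs no training step beyond storing the data, whereas linear regression compresses the data into a small set of fitted parameters, which is one reason the two make a useful first comparison.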
Background material
- Introduction to Machine Learning handout
- UCI Machine Learning Repository. This is the original repository for a very diverse set of datasets.
- Machine Learning Summer School 2014 in Pittsburgh. Many overview courses, tutorials, slides, etc.
- ImageNet dataset. A large collection of labeled images that many researchers use as a benchmark for computer vision.
- MNIST dataset of handwritten digits.
Videos