Syllabus

Overview

Scalable Machine Learning arises when Statistics, Systems, Machine Learning, and Data Mining are combined into flexible, often nonparametric, and scalable techniques for analyzing large amounts of data at internet scale. This class aims to teach the methods that will power the next generation of internet applications.

The class will cover systems and processing paradigms, an introduction to statistical analysis, algorithms for data streams, generalized linear methods (logistic models, support vector machines, etc.), large scale convex optimization, kernels, graphical models and inference algorithms such as sampling and variational approximations, and explore/exploit mechanisms. Applications include social recommender systems, real time analytics, spam filtering, topic models, and document analysis.

Lectures

1. Systems

2. Basic Statistics

3. Data Sketches and Streams

4. Optimization

5. Generalized Linear Models

6. Kernels and Regularization

7. Recommender Systems

8. Midterm Project Presentations

The maximum team size is 4; a typical team should have 3 members. Each team pitches its project to the class for 10 minutes and hands in a written report of at least 4 and at most 10 pages in a reasonable font size. You should be able to address the following criteria (adapted from Heilmeier’s criteria for the purpose of this class). This type of reasoning will help you choose your own research agenda, write grants, convince colleagues, secure VC funding, and write papers.

9. Graphical Models

10. Latent Variable Model Templates

11. Structured Estimation

12. Large Scale Inference in Graphical Models

13. Applications

14. Explore Exploit

15. Final Project Presentations

Each team gives a final presentation of its project to the class. This may be a traditional talk, a demo, a product, an app, or any combination thereof. Make sure you discuss what you’re doing, why you’re doing it, how it differs from or improves on what’s available, and what it is good for.