Deep Networks
Content
Overview
- CPU performance, data, and the sweet spot for algorithms
- Perceptron and going nonlinear (wide or deep)
- Backpropagation
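The perceptron update, the need to "go nonlinear", and backpropagation listed above can be sketched in a few lines. This is an illustrative sketch, not code from the lecture; the hidden-layer width, learning rates, and the XOR task are my own choices.

```python
import numpy as np

def perceptron_step(w, x, y, lr=0.1):
    """One perceptron update: change w only if (x, y) is misclassified."""
    if y * np.dot(w, x) <= 0:       # margin test; y is +1 or -1
        w = w + lr * y * x
    return w

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_xor(steps=5000, lr=1.0, seed=0):
    """Tiny two-layer net trained by backpropagation on XOR,
    a problem no single linear unit can solve."""
    rng = np.random.default_rng(seed)
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([[0], [1], [1], [0]], dtype=float)
    W1 = rng.normal(size=(2, 8))      # hidden layer: the nonlinearity
    W2 = rng.normal(size=(8, 1))
    losses = []
    for _ in range(steps):
        h = sigmoid(X @ W1)           # forward pass
        p = sigmoid(h @ W2)
        losses.append(float(np.mean((p - y) ** 2)))
        d_p = (p - y) * p * (1 - p)   # backward pass: chain rule
        d_h = (d_p @ W2.T) * h * (1 - h)
        W2 -= lr * h.T @ d_p          # gradient step on each layer
        W1 -= lr * X.T @ d_h
    return losses

losses = train_xor()
```

The same backward-pass pattern extends to any depth: each layer receives the gradient of the loss with respect to its output and passes one on to the layer below.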
Layers
- Fully connected
- Convolutions
- Invariances for images
- Whole system training
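A minimal sketch of the two layer types above, assuming single-channel inputs (real frameworks batch these and add many channels). The convolution slides one shared kernel over every position, which is the built-in translation invariance for images mentioned above.

```python
import numpy as np

def fully_connected(x, W, b):
    """Dense layer: every output depends on every input."""
    return W @ x + b

def conv2d_valid(image, kernel):
    """Naive 'valid' 2-D convolution (really cross-correlation, as in
    most deep-learning toolkits). The same kernel weights are reused at
    every position, so a shifted input gives a shifted output."""
    H, W_ = image.shape
    kh, kw = kernel.shape
    out = np.zeros((H - kh + 1, W_ - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out
```

Weight sharing is also why a convolutional layer has far fewer parameters than a fully connected layer over the same input size: kh * kw weights instead of one per input-output pair.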
Objective functions
- Classification and SoftMax
- Regression
- Autoencoder
- Contrastive Estimation
- Invariances
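The classification and regression objectives above can be written down directly; this is a standard formulation, sketched here for reference (the max-subtraction in the softmax is the usual numerical-stability trick).

```python
import numpy as np

def softmax(z):
    """Map scores to probabilities; subtracting max(z) avoids overflow
    without changing the result."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

def cross_entropy(p, label):
    """Classification loss: negative log-likelihood of the true class."""
    return -np.log(p[label])

def squared_error(pred, target):
    """Regression loss."""
    return 0.5 * np.sum((pred - target) ** 2)
```

An autoencoder uses the same machinery with the input as its own target, e.g. `squared_error(decode(encode(x)), x)`.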
Optimization
- Stochastic Gradient Descent
- Learning rates, AdaGrad, Minibatches
- Momentum
- Dropout and DropConnect for regularization
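The update rules named above can be sketched on a toy objective; the learning rates, the quadratic f(w) = w^2, and the keep/rescale form of dropout ("inverted" dropout) are my illustrative choices, not the lecture's settings.

```python
import numpy as np

def momentum_step(w, v, g, lr=0.1, mu=0.9):
    """Heavy-ball update: the velocity v accumulates past gradients."""
    v = mu * v - lr * g
    return w + v, v

def adagrad_step(w, G, g, lr=0.5, eps=1e-8):
    """AdaGrad: per-coordinate steps shrink with accumulated squared
    gradients, so frequently-updated coordinates slow down."""
    G = G + g * g
    return w - lr * g / (np.sqrt(G) + eps), G

def dropout(h, p, rng):
    """Inverted dropout: zero each unit with probability p at training
    time and rescale survivors so the expected activation is unchanged."""
    mask = (rng.random(h.shape) >= p) / (1.0 - p)
    return h * mask

grad = lambda w: 2.0 * w                  # gradient of f(w) = w^2
w_m, v = np.array([5.0]), np.zeros(1)
w_a, G = np.array([5.0]), np.zeros(1)
for _ in range(100):
    w_m, v = momentum_step(w_m, v, grad(w_m))
    w_a, G = adagrad_step(w_a, G, grad(w_a))
```

Minibatch SGD plugs an averaged gradient over a small random subset of the data into either update; DropConnect applies the same masking idea to weights instead of activations.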
Memory
- Recurrent Networks
- Hidden Markov Models
- Long Short-Term Memory
- Memory Networks
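One LSTM step, as a hedged sketch (the single stacked weight matrix W and the dimensions are my packaging; real cells add biases per gate and learn W). The gates decide what enters, stays in, and leaves the cell state c, which is what lets gradients survive over long time spans where a plain recurrent network's vanish.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, W, b):
    """One LSTM step: i, f, o are the input, forget, and output gates;
    g is the candidate written into the cell state."""
    z = W @ np.concatenate([x, h]) + b
    i, f, o, g = np.split(z, 4)
    c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
    h = sigmoid(o) * np.tanh(c)
    return h, c

d_x, d_h = 3, 2
W = np.zeros((4 * d_h, d_x + d_h))    # zero weights: every gate sits at 0.5
b = np.zeros(4 * d_h)
h, c = np.zeros(d_h), np.ones(d_h)
h, c = lstm_step(np.ones(d_x), h, c, W, b)
```

Neural Turing Machines and Memory Networks push the same idea further: instead of a single gated cell, the controller reads and writes an explicit, addressable memory.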
Supplementary material
- Nair and Hinton, 2010, Rectified Linear Units
- Simonyan and Zisserman, 2014, Narrow and deep beats wide and shallow
- Szegedy et al., 2014, Inception Layer in GoogLeNet
- LeCun, Bottou, Bengio, and Haffner, 2001, Whole system training
- Grefenstette et al., 2014, Autoencoder between domains
- Senior, Heigold, Ranzato, and Yang, 2013, Learning Rate Comparison
- Duchi, Hazan, and Singer, 2010, AdaGrad
- Srivastava, Hinton, Krizhevsky, Sutskever, and Salakhutdinov, 2014, Dropout
- Graves, 2013, LSTM Tutorial
- Graves, Wayne, and Danihelka, 2014, Neural Turing Machine
- Weston, Chopra, and Bordes, 2014, Memory Networks
Video