ECE 598MR: Statistical Learning Theory (Fall 2015)



The schedule will be updated and revised as the course progresses. Each topic will come with links to reference materials; key references will be highlighted. To get a rough idea of the material, check out the schedules from past offerings: Fall 13, Fall 14.
Tue, Aug 25

Introduction, history, overview, and administrivia.

Thu, Aug 27
Tue, Sep 1

Concentration inequalities: Markov's and Chebyshev's inequalities; McDiarmid's bounded differences inequality; examples
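The first two inequalities on this topic admit a quick numerical sanity check. A minimal sketch (the Binomial example and all constants are assumptions, not from the course materials):

```python
import numpy as np

# Empirically check Markov's and Chebyshev's inequalities for a
# Binomial(n, p) sum of independent Bernoulli variables.
rng = np.random.default_rng(0)
n, p, trials = 100, 0.5, 200_000
S = rng.binomial(n, p, size=trials)  # sum of n Bernoulli(p) variables

t = 60
empirical = np.mean(S >= t)
markov = (n * p) / t                          # P[S >= t] <= E[S] / t
chebyshev = n * p * (1 - p) / (t - n * p)**2  # P[S - E[S] >= t - E[S]] <= Var(S) / (t - E[S])^2

assert empirical <= chebyshev <= markov  # Chebyshev is tighter here
```

As expected for a tail this far from the mean, Chebyshev (variance-based) is much tighter than Markov (mean-based), and both are far from the true tail probability, which motivates the exponential bounds covered next.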

Thu, Sep 3

Formulation of the learning problem: concept and function learning; realizable case; Probably Approximately Correct (PAC) learning.
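A small simulation can illustrate the realizable PAC definition. This sketch (the threshold class, sample-size formula, and constants are an assumed textbook example, not course material) learns thresholds c_a(x) = 1{x >= a} on [0,1] and checks that the failure probability behaves as the theory predicts:

```python
import numpy as np

# Realizable PAC learning of thresholds on [0,1]: with
# n >= (1/eps) * log(1/delta) samples, returning the smallest observed
# positive example yields error <= eps with probability >= 1 - delta.
rng = np.random.default_rng(7)
a_true, eps, delta = 0.3, 0.05, 0.05
n = int(np.ceil(np.log(1 / delta) / eps))  # = 60 here

errors = []
for _ in range(500):
    x = rng.uniform(size=n)
    pos = x[x >= a_true]                    # positively labeled samples
    a_hat = pos.min() if len(pos) else 1.0  # consistent hypothesis
    errors.append(a_hat - a_true)           # error mass = P[a_true <= x < a_hat]

# Failure rate should track (1 - eps)^n, which is below delta by design.
frac_bad = np.mean(np.array(errors) > eps)
assert frac_bad <= 2 * delta
```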

Tue, Sep 8

Formulation of the learning problem, continued: agnostic (model-free) learning; consistency; Empirical Risk Minimization

Thu, Sep 10
Tue, Sep 15

Empirical Risk Minimization: abstract risk bounds via Rademacher averages; stochastic inequalities for ERM; structural results for Rademacher averages (including the Finite Class Lemma); introduction to VC classes
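The Finite Class Lemma can be checked numerically. A hedged sketch (the sign-vector class and all sizes are assumptions for illustration): Monte Carlo estimate of the empirical Rademacher average of a finite class of N vectors in {-1,+1}^n, compared against the Finite Class (Massart) bound max_j ||a_j||_2 * sqrt(2 log N) / n.

```python
import numpy as np

rng = np.random.default_rng(1)
n, N, draws = 200, 50, 5000
A = rng.choice([-1.0, 1.0], size=(N, n))  # the finite class: N sign vectors

# Monte Carlo over Rademacher sign vectors sigma: average the maximum
# correlation (1/n) <sigma, a_j> over the class.
sigma = rng.choice([-1.0, 1.0], size=(draws, n))
rademacher_avg = np.mean(np.max(sigma @ A.T, axis=1)) / n

# Finite Class Lemma bound; here ||a_j||_2 = sqrt(n) for every j.
massart = np.sqrt(n) * np.sqrt(2 * np.log(N)) / n

assert 0 < rademacher_avg <= massart
```

The estimate typically lands within a modest factor of the bound, which is why the lemma is sharp enough to drive the VC-class bounds that follow.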

Thu, Sep 17
Vapnik-Chervonenkis classes: shatter coefficients; VC dimension; examples of VC classes; Sauer-Shelah lemma; implications for Rademacher averages
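The Sauer-Shelah lemma is easy to verify exhaustively for a small class. This sketch (the interval class is an assumed standard example, not taken from the lecture) counts the labelings of n collinear points realized by indicator functions of intervals, a class with VC dimension 2:

```python
from math import comb

# Count labelings of n distinct points x_1 < ... < x_n on the line
# realized by intervals [a, b] (label 1 inside, 0 outside). An interval
# picks out a contiguous run of points, possibly empty.
def interval_shatter_coefficient(n):
    labelings = {(0,) * n}
    for i in range(n):
        for j in range(i, n):
            labelings.add(tuple(1 if i <= k <= j else 0 for k in range(n)))
    return len(labelings)

n, d = 10, 2  # VC dimension of intervals on the line is d = 2

# Exact count: n*(n+1)/2 nonempty runs plus the all-zeros labeling.
assert interval_shatter_coefficient(n) == 1 + n * (n + 1) // 2

# Sauer-Shelah: S(n) <= sum_{i=0}^{d} C(n, i); intervals attain it exactly.
assert interval_shatter_coefficient(n) <= sum(comb(n, i) for i in range(d + 1))
```

For this class the Sauer-Shelah bound is tight: 56 labelings versus C(10,0)+C(10,1)+C(10,2) = 56.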
Tue, Sep 22
Thu, Sep 24
Binary classification: bounds for simple VC classes (linear and generalized linear discriminant rules); surrogate loss functions; margin-based bounds
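The starting point for surrogate-loss and margin-based bounds is that a convex surrogate dominates the 0-1 loss of the margin. A minimal sketch (the hinge loss is used here as an assumed representative example):

```python
import numpy as np

# The hinge loss phi(m) = max(0, 1 - m) upper-bounds the 0-1 loss
# 1{m <= 0} of the margin m = y * f(x), pointwise over all margins.
margins = np.linspace(-2, 2, 401)
zero_one = (margins <= 0).astype(float)
hinge = np.maximum(0.0, 1.0 - margins)

assert np.all(hinge >= zero_one)  # surrogate risk dominates 0-1 risk
```

Taking expectations, the surrogate risk upper-bounds the classification risk, so convex risk minimization controls the quantity one actually cares about.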
Tue, Sep 29
Thu, Oct 1
No class: Allerton conference

Tue, Oct 6
Thu, Oct 8
Binary classification, continued: reproducing kernel Hilbert spaces and kernel machines; convex risk minimization
Tue, Oct 13
Thu, Oct 15
Regression with quadratic loss
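A minimal sketch for this topic (the linear model and constants are assumptions): empirical risk minimization with the quadratic loss over a linear class is ordinary least squares, solvable in closed form.

```python
import numpy as np

# ERM for quadratic loss over linear predictors:
#   w_hat = argmin_w (1/n) sum_i (y_i - <w, x_i>)^2
rng = np.random.default_rng(2)
n, d = 500, 3
X = rng.normal(size=(n, d))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=n)

w_hat, *_ = np.linalg.lstsq(X, y, rcond=None)  # closed-form minimizer
assert np.allclose(w_hat, w_true, atol=0.05)   # close to the truth at this n
```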
Tue, Oct 20
Thu, Oct 22
Thu, Oct 29
Tue, Nov 3
Thu, Nov 5
Stability of learning algorithms: learnability without uniform convergence; average and uniform stability of learning algorithms; the role of convexity and strong convexity; stability of Stochastic Gradient Descent; connection between differential privacy, stability, and generalization
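A hedged numerical illustration of the stability idea (all constants and the ridge objective are assumptions, not course material): run one pass of SGD on a strongly convex objective over two samples that differ in a single example, and check that the two outputs stay close. This parameter-level closeness is the kind of algorithmic stability that yields generalization bounds without uniform convergence.

```python
import numpy as np

rng = np.random.default_rng(3)
n, d, lam, eta = 200, 5, 0.1, 0.01

# Two datasets differing in exactly one example.
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + 0.1 * rng.normal(size=n)
X2, y2 = X.copy(), y.copy()
X2[0], y2[0] = rng.normal(size=d), rng.normal()  # swap out one example

order = np.random.default_rng(4).permutation(n)  # shared example order

def one_pass_sgd(X, y):
    # One pass of SGD on ridge-regularized squared loss (strongly convex).
    w = np.zeros(d)
    for i in order:
        grad = (w @ X[i] - y[i]) * X[i] + lam * w
        w = w - eta * grad
    return w

w1, w2 = one_pass_sgd(X, y), one_pass_sgd(X2, y2)
assert np.linalg.norm(w1 - w2) < 0.5  # outputs remain close
```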
Tue, Nov 10
Thu, Nov 12
Online learning: basic model; regret; regret bounds for online convex and strongly convex programming via projected gradient descent; online-to-batch conversions; relation to Rademacher averages.
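Projected online gradient descent is short enough to verify end to end. In this sketch (the linear losses on the unit ball and the step-size schedule are assumed choices), the standard analysis with eta_t = D/(G sqrt(t)) gives regret at most (3/2) D G sqrt(T), which the run respects:

```python
import numpy as np

rng = np.random.default_rng(5)
T, d = 2000, 4
g = rng.normal(size=(T, d))
g /= np.linalg.norm(g, axis=1, keepdims=True)  # unit gradients, so G = 1
D = 2.0                                        # diameter of the unit ball

w = np.zeros(d)
loss, cum_g = 0.0, np.zeros(d)
for t in range(T):
    loss += g[t] @ w            # suffer linear loss f_t(w) = <g_t, w>
    cum_g += g[t]
    w = w - (D / np.sqrt(t + 1)) * g[t]  # gradient step, eta_t = D / sqrt(t)
    norm = np.linalg.norm(w)
    if norm > 1.0:              # project back onto the unit ball
        w /= norm

# Best fixed comparator in hindsight is -cum_g / ||cum_g||.
best = -np.linalg.norm(cum_g)
regret = loss - best
assert regret <= 1.5 * D * np.sqrt(T)  # the O(sqrt(T)) regret guarantee
```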
Tue, Nov 17
Thu, Nov 19
Tue, Dec 1
Minimax lower bounds: binary classification under a margin assumption; reduction to finite testing on a binary hypercube (Assouad's lemma); extra log factor for rich VC classes; information-theoretic methods (Fano's inequality)
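The Fano step in these arguments is pure arithmetic, sketched below with assumed numbers (the hypothesis count and information radius are illustrative, not from the lecture):

```python
import numpy as np

# Fano's inequality for M hypotheses: if the mutual information between
# the parameter and the sample is at most beta nats, any test errs with
# probability at least 1 - (beta + log 2) / log M.
M = 1024      # e.g., hypotheses indexed by a 10-bit hypercube
beta = 2.0    # assumed bound on mutual information (nats)

fano_lower = 1 - (beta + np.log(2)) / np.log(M)
assert fano_lower > 0.5  # no test succeeds with probability above 1/2
```

Packing more hypotheses (larger M) while keeping them statistically close (small beta) is exactly what the hypercube constructions in this lecture arrange.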