The schedule will be updated and revised as the course progresses. Links to scribed lecture notes are on the left.

- Tue Jan 15
- Introduction and administrivia
- Control as optimization over time
- Stochastic optimization, static case: complete and partial observations
- Blackwell's principle of irrelevant information

- Thu Jan 17

[week 1 notes] - Controlled Markov processes
- Policies and their expected cost
- Optimality of Markov policies

- Tue Jan 22
- Dynamic programming
- comparison principle
- Bellman's principle of optimality
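
Bellman's principle of optimality can be made concrete with a backward dynamic programming recursion. The sketch below uses a hypothetical 2-state, 2-action finite-horizon MDP (all numbers are illustrative, not from the course):

```python
import numpy as np

# Hypothetical 2-state, 2-action finite-horizon MDP (numbers illustrative).
# P[a][x, y] = transition probability, c[x, a] = per-stage cost.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],   # action 0
              [[0.5, 0.5], [0.6, 0.4]]])  # action 1
c = np.array([[1.0, 2.0],   # costs at state 0 for actions 0, 1
              [0.0, 0.5]])  # costs at state 1
T = 5                       # horizon

V = np.zeros(2)             # terminal cost V_T = 0
policy = []
for t in range(T - 1, -1, -1):
    # Q[x, a] = c(x, a) + sum_y P(y | x, a) V_{t+1}(y)
    Q = c + np.einsum('axy,y->xa', P, V)
    policy.append(Q.argmin(axis=1))  # Bellman: act greedily w.r.t. V_{t+1}
    V = Q.min(axis=1)
policy.reverse()
print(V)        # optimal cost-to-go at time 0
print(policy)   # optimal Markov policy, one action map per stage
```

The recursion produces a Markov policy, one decision rule per stage, which is exactly the optimality-of-Markov-policies statement in the finite case.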

- Thu Jan 24

[week 2 notes] - Monotonicity of value functions
- stochastic dominance
- stochastically monotone MDPs
- sufficient condition for monotonicity of value functions
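
First-order stochastic dominance is easy to check numerically: on an ordered finite state space, one distribution dominates another iff its CDF lies below pointwise. A minimal sketch, with an illustrative kernel whose rows increase in the dominance order (the stochastic-monotonicity condition):

```python
import numpy as np

def dominates(p, q):
    """First-order stochastic dominance on {0,...,n-1}: p dominates q iff
    the CDF of p lies below that of q at every point."""
    return bool(np.all(np.cumsum(p) <= np.cumsum(q) + 1e-12))

# Illustrative transition kernel on 3 states (rows = current state):
# the chain is stochastically monotone if rows increase in the FOSD order.
P = np.array([[0.6, 0.3, 0.1],
              [0.3, 0.4, 0.3],
              [0.1, 0.3, 0.6]])
print(all(dominates(P[i + 1], P[i]) for i in range(2)))  # True
```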

- Tue Jan 29
- Markov decision processes with general state and action spaces
- review: Kolmogorov axioms of probability
- measurability, random variables, Borel sigma-fields, Borel spaces
- MDPs with Borel state and action spaces
- optimality of Markov policies in the general case

- Thu Jan 31

[week 3 notes] - The finite-horizon Linear Quadratic Regulator (LQR) problem
- dynamic programming recursion
- completing-the-squares lemma
- structural properties: policies are linear, value functions are quadratic and convex
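
The completing-the-squares argument yields a backward Riccati recursion; a sketch on an assumed double-integrator system (A, B, Q, R below are illustrative, not from the notes):

```python
import numpy as np

# Finite-horizon LQR via the Riccati recursion (illustrative system).
A = np.array([[1.0, 1.0], [0.0, 1.0]])   # double integrator
B = np.array([[0.0], [1.0]])
Q = np.eye(2)           # state cost weight
R = np.array([[1.0]])   # control cost weight
T = 20

P = Q.copy()            # terminal cost x' Q x
gains = []
for _ in range(T):
    # completing the squares gives the linear policy u = -K x with
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    # and the quadratic convex value function x' P x with
    P = Q + A.T @ P @ (A - B @ K)
    gains.append(K)
gains.reverse()
print(gains[0])         # time-0 feedback gain
```

The structural properties are visible directly: each `K` gives a linear policy, and each `P` is symmetric positive semidefinite, so the value function is quadratic and convex.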

- Tue Feb 5
- MDPs with partial observations: the finite case
- problem formulation and goals
- prelude: hidden Markov models
- nonlinear filter recursion and Markov property
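
The nonlinear filter recursion for a finite HMM is a predict step (Chapman-Kolmogorov) followed by a Bayes correction; the normalization is what makes it nonlinear. A sketch with assumed toy matrices and observations:

```python
import numpy as np

# Nonlinear filter for a finite hidden Markov model (toy numbers):
# pi_t(x) = P(X_t = x | Y_1..Y_t), updated by predict + Bayes correction.
P = np.array([[0.95, 0.05],   # signal transition matrix
              [0.10, 0.90]])
O = np.array([[0.8, 0.2],     # O[x, y] = P(Y = y | X = x)
              [0.3, 0.7]])

def filter_step(pi, y):
    pred = pi @ P                # prediction step
    post = pred * O[:, y]        # multiply by observation likelihood
    return post / post.sum()     # normalize (the nonlinear part)

pi = np.array([0.5, 0.5])
for y in [0, 0, 1, 1, 1]:        # an assumed observation sequence
    pi = filter_step(pi, y)
print(pi)                        # posterior over the hidden state
```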

- Thu Feb 7

[week 4 notes] - MDPs with partial observations, continued
- definition of admissible policies and expected cost
- state space augmentation: belief state
- belief state process as a controlled Markov process

- Tue Feb 12
- MDPs with partial observations, continued
- reduction to the fully observed case via belief state
- dynamic programming in belief space
- example: the two-armed bandit

- Thu Feb 14

[week 5 notes] - The two-armed bandit problem
- state-space formulation
- derivation of the belief state evolution
- optimality of the index policy
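
One standard instance of the belief-state evolution is the Bayesian Bernoulli bandit, where independent Beta priors on the unknown success probabilities make the belief update conjugate (the course's exact formulation may differ; this only illustrates the belief dynamics, not the index policy itself):

```python
# Belief-state evolution for a Bernoulli two-armed bandit, assuming
# independent Beta priors on the unknown success probabilities.
beliefs = [[1, 1], [1, 1]]          # Beta(alpha, beta) parameters per arm

def pull(arm, reward):
    """Conjugate update: a success bumps alpha, a failure bumps beta."""
    a, b = beliefs[arm]
    beliefs[arm] = [a + reward, b + (1 - reward)]

def posterior_mean(arm):
    a, b = beliefs[arm]
    return a / (a + b)

for arm, r in [(0, 1), (0, 1), (1, 0), (0, 0)]:  # assumed play/reward history
    pull(arm, r)
print(posterior_mean(0), posterior_mean(1))
```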

- Tue Feb 19
- Stochastic control over an infinite horizon
- infinite-horizon optimality criteria: discounted cost, average cost
- discounted cost
- Discounted-Cost Optimality Equation
- contraction property of the dynamic programming operator
- existence and uniqueness of the value function, value iteration
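
Because the DP operator is a beta-contraction in the sup norm, value iteration converges geometrically to the unique fixed point of the Discounted-Cost Optimality Equation. A sketch on an assumed toy MDP:

```python
import numpy as np

# Value iteration for V = min_a [ c(·, a) + beta * P_a V ] (toy data).
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.6, 0.4]]])   # P[a][x, y]
c = np.array([[1.0, 2.0], [0.0, 0.5]])     # c[x, a]
beta = 0.9

V = np.zeros(2)
for _ in range(500):
    TV = (c + beta * np.einsum('axy,y->xa', P, V)).min(axis=1)
    if np.max(np.abs(TV - V)) < 1e-10:     # contraction => unique fixed point
        V = TV
        break
    V = TV
print(V)   # approximate fixed point of the DP operator
```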

- Thu Feb 21

[week 6 notes] - Discounted cost, continued
- policy iteration
- structure of discount-optimal policies
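
Policy iteration alternates exact policy evaluation (a linear solve) with greedy improvement, and terminates at a discount-optimal policy in finitely many steps on a finite MDP. A sketch on the same assumed toy data:

```python
import numpy as np

# Policy iteration for a toy discounted MDP (data assumed).
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.6, 0.4]]])   # P[a][x, y]
c = np.array([[1.0, 2.0], [0.0, 0.5]])     # c[x, a]
beta = 0.9
n = 2

pi = np.zeros(n, dtype=int)                # initial policy: always action 0
while True:
    # evaluate: V_pi solves (I - beta P_pi) V = c_pi
    P_pi = P[pi, np.arange(n)]             # row x of P[pi[x]]
    c_pi = c[np.arange(n), pi]
    V = np.linalg.solve(np.eye(n) - beta * P_pi, c_pi)
    # improve: greedy with respect to V
    new_pi = (c + beta * np.einsum('axy,y->xa', P, V)).argmin(axis=1)
    if np.array_equal(new_pi, pi):
        break                              # fixed point: pi is discount-optimal
    pi = new_pi
print(pi, V)
```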

- Tue Feb 26
- Average-cost optimality criterion
- canonical triplets
- Average-Cost Optimality Equation
- relative value function as terminal cost for a finite-horizon problem
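
Relative value iteration solves the Average-Cost Optimality Equation g + h(x) = min_a [c(x,a) + Σ_y P(y|x,a) h(y)] by normalizing the DP iterates at a reference state. A sketch on an assumed toy MDP (convergence uses that all transition probabilities here are positive):

```python
import numpy as np

# Relative value iteration for the average-cost optimality equation.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.6, 0.4]]])   # P[a][x, y], toy data
c = np.array([[1.0, 2.0], [0.0, 0.5]])     # c[x, a]

h = np.zeros(2)
for _ in range(2000):
    Th = (c + np.einsum('axy,y->xa', P, h)).min(axis=1)
    g = Th[0]                  # normalize the relative value at state 0
    h_new = Th - g
    if np.max(np.abs(h_new - h)) < 1e-12:
        h = h_new
        break
    h = h_new
print(g, h)   # optimal average cost and relative value function
```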

- Thu Feb 28

- Tue Mar 5
- Linear Quadratic Regulator under the average-cost criterion
- quadratic ansatz for the relative value function, discrete algebraic Riccati equation
- conditions for existence and uniqueness of the fixed point of the DARE
- existence of stabilizing optimal controller
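
The fixed point of the DARE can be computed by iterating the finite-horizon Riccati recursion; under stabilizability of (A, B) and detectability of (A, Q^{1/2}) the iteration converges to the unique stabilizing solution. A sketch on an assumed double-integrator system:

```python
import numpy as np

# Solving the discrete algebraic Riccati equation (DARE) by iterating the
# Riccati recursion to its fixed point (illustrative system matrices).
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

P = Q.copy()
for _ in range(1000):
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    P_new = Q + A.T @ P @ (A - B @ K)
    if np.max(np.abs(P_new - P)) < 1e-12:
        P = P_new
        break
    P = P_new
K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
print(np.abs(np.linalg.eigvals(A - B @ K)).max())  # < 1: stabilizing controller
```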

- Thu Mar 7

[week 8 notes] - Linear-Quadratic-Gaussian (LQG) problem with partial observations
- problem formulation
- linear-Gaussian HMM: derivation of the Kalman filter
- controllability and observability conditions for existence and uniqueness of the steady-state Kalman filter
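
The Kalman filter is the predict/update recursion for the linear-Gaussian observation model x_{t+1} = A x_t + w_t, y_t = C x_t + v_t. A sketch with assumed matrices and observations (all numbers illustrative):

```python
import numpy as np

# One predict/update cycle of the Kalman filter (illustrative model).
A = np.array([[1.0, 1.0], [0.0, 1.0]])
C = np.array([[1.0, 0.0]])
W = 0.1 * np.eye(2)      # process noise covariance
V = np.array([[0.5]])    # observation noise covariance

def kalman_step(x_hat, P, y):
    # predict
    x_pred = A @ x_hat
    P_pred = A @ P @ A.T + W
    # update: the Kalman gain weighs the innovation y - C x_pred
    S = C @ P_pred @ C.T + V
    K = P_pred @ C.T @ np.linalg.inv(S)
    x_new = x_pred + K @ (y - C @ x_pred)
    P_new = (np.eye(2) - K @ C) @ P_pred
    return x_new, P_new

x_hat, P = np.zeros(2), np.eye(2)
for y in [np.array([0.9]), np.array([2.1]), np.array([2.9])]:  # assumed data
    x_hat, P = kalman_step(x_hat, P, y)
print(x_hat, P)
```

Iterating the covariance recursion alone gives the steady-state filter whose existence and uniqueness the controllability/observability conditions guarantee.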

- Tue Mar 12
- Review
- Thu Mar 14
- **In-Class Midterm**

- Tue Mar 19
- Thu Mar 21
- **SPRING BREAK**

- Tue Mar 26

- Thu Mar 28

[week 10 notes] - Partially observed LQG
- review: fully observed LQR, the Kalman filter
- state estimation in the presence of control: information equivalence, linear dynamics of the information state
- partially observed LQG problem, finite-horizon and infinite horizon
- separation between estimation and control, the dual effect

- Tue Apr 2

- Thu Apr 4

[week 11 notes] - Continuous-time Markov processes
- abstract definition, transition kernels, Chapman-Kolmogorov equation, time-homogeneous processes
- finite-state continuous-time Markov processes
- pure-jump processes, transition intensities and Q-matrices
- forward Kolmogorov equation
- a construction using Poisson processes: uniformization
- diffusion processes
- definition: local drift coefficient and diffusion matrix
- infinitesimal generator: second-order diffusion operator
- a construction using an Euler scheme and Brownian motion, conditions for well-posedness of the forward Kolmogorov equation
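
The uniformization construction can be checked numerically: with lam at least the largest exit rate, P = I + Q/lam is a stochastic matrix, and running this jump chain on a Poisson(lam) clock reproduces the CTMC, i.e. e^{tQ} = Σ_k Poisson(lam·t)(k) P^k. A sketch with an assumed toy Q-matrix:

```python
import numpy as np

# Uniformization of a finite-state CTMC (toy generator matrix).
Q = np.array([[-2.0, 2.0],
              [1.0, -1.0]])
lam = 2.0                      # lam >= max_x |Q[x, x]|
P = np.eye(2) + Q / lam        # stochastic jump-chain kernel: [[0, 1], [0.5, 0.5]]

t = 0.7
approx = np.zeros((2, 2))
weight = np.exp(-lam * t)      # Poisson(lam * t) pmf at k = 0
Pk = np.eye(2)
for k in range(60):            # truncate the Poisson sum
    approx += weight * Pk
    Pk = Pk @ P
    weight *= lam * t / (k + 1)
print(approx)                  # approximates the transition matrix e^{tQ}
```

For this two-state generator the matrix exponential is known in closed form, which makes the construction easy to verify.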

- Tue Apr 9

- Thu Apr 11

[week 12 notes] - Ito calculus and controlled diffusion processes
- Filtrations, adapted processes, stochastic integral in the sense of K. Ito
- Ito's lemma
- Controlled diffusion processes: expected cost, cost-to-go, value function
- Necessary condition for optimality, the Hamilton-Jacobi-Bellman equation
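
For the controlled diffusion dX_t = b(X_t, u_t) dt + σ(X_t, u_t) dW_t with running cost c and terminal cost g, the HJB equation in its standard finite-horizon form (sign conventions vary across texts) reads:

```latex
\partial_t V(t,x)
  + \min_{u}\Big[\, c(x,u) + b(x,u)^\top \nabla_x V(t,x)
  + \tfrac{1}{2}\,\mathrm{tr}\big(\sigma\sigma^\top(x,u)\,\nabla_x^2 V(t,x)\big) \Big] = 0,
\qquad V(T,x) = g(x).
```

The bracketed expression applies the infinitesimal generator of the controlled diffusion to V, so this is the continuous-time analogue of the discrete-time DP recursion.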

- Tue Apr 16

- Thu Apr 18

[week 13 notes] - Markov decision processes in continuous time: examples
- The linear quadratic regulator in continuous time: HJB equation and Riccati equation, optimal policy, value function
- Controlled continuous-time finite-state Markov processes

- Tue Apr 23

- Mon Apr 29

[week 14 notes] - Nonlinear filtering in continuous time
- The linear quadratic Gaussian problem in continuous time: the Kalman-Bucy filter
- General nonlinear filtering in continuous time
- the reference measure approach: Girsanov change of measure, the Kallianpur-Striebel formula
- evolution of the nonlinear filter: unnormalized (Duncan-Mortensen-Zakai equation) and normalized (Kushner-Stratonovich equation)
- finite-state signal: the Wonham filter