The schedule will be updated and revised as the course progresses. Links to scribed lecture notes are on the left.

- Tue Jan 15
- Introduction and administrivia
- Control as optimization over time
- Stochastic optimization, static case: complete and partial observations
- Blackwell's principle of irrelevant information

- Thu Jan 17

[week 1 notes] - Controlled Markov processes
- Policies and their expected cost
- Optimality of Markov policies

- Tue Jan 22
- Dynamic programming
- comparison principle
- Bellman's principle of optimality
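
Bellman's principle of optimality can be made concrete with a backward dynamic programming recursion. The sketch below uses a hypothetical 2-state, 2-action finite-horizon MDP (all numbers are illustrative, not from the course):

```python
import numpy as np

# Hypothetical 2-state, 2-action finite-horizon MDP (numbers illustrative).
# P[a][x, y] = transition probability, c[x, a] = per-stage cost.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],   # action 0
              [[0.5, 0.5], [0.6, 0.4]]])  # action 1
c = np.array([[1.0, 2.0],   # costs at state 0 for actions 0, 1
              [0.0, 0.5]])  # costs at state 1
T = 5                       # horizon

V = np.zeros(2)             # terminal cost V_T = 0
policy = []
for t in range(T - 1, -1, -1):
    # Q[x, a] = c(x, a) + sum_y P(y | x, a) V_{t+1}(y)
    Q = c + np.einsum('axy,y->xa', P, V)
    policy.append(Q.argmin(axis=1))  # Bellman: act greedily w.r.t. V_{t+1}
    V = Q.min(axis=1)
policy.reverse()
print(V)        # optimal cost-to-go at time 0
print(policy)   # optimal Markov policy, one action map per stage
```

The recursion produces a Markov policy, one decision rule per stage, which is exactly the optimality-of-Markov-policies statement in the finite case.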

- Thu Jan 24

[week 2 notes] - Monotonicity of value functions
- stochastic dominance
- stochastically monotone MDPs
- sufficient condition for monotonicity of value functions
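
First-order stochastic dominance is easy to check numerically: on an ordered finite state space, one distribution dominates another iff its CDF lies below pointwise. A minimal sketch, with an illustrative kernel whose rows increase in the dominance order (the stochastic-monotonicity condition):

```python
import numpy as np

def dominates(p, q):
    """First-order stochastic dominance on {0,...,n-1}: p dominates q iff
    the CDF of p lies below that of q at every point."""
    return bool(np.all(np.cumsum(p) <= np.cumsum(q) + 1e-12))

# Illustrative transition kernel on 3 states (rows = current state):
# the chain is stochastically monotone if rows increase in the FOSD order.
P = np.array([[0.6, 0.3, 0.1],
              [0.3, 0.4, 0.3],
              [0.1, 0.3, 0.6]])
print(all(dominates(P[i + 1], P[i]) for i in range(2)))  # True
```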

- Tue Jan 29
- Markov decision processes with general state and action spaces
- review: Kolmogorov axioms of probability
- measurability, random variables, Borel sigma-fields, Borel spaces
- MDPs with Borel state and action spaces
- optimality of Markov policies in the general case

- Thu Jan 31

[week 3 notes] - The finite-horizon Linear Quadratic Regulator (LQR) problem
- dynamic programming recursion
- completing-the-squares lemma
- structural properties: policies are linear, value functions are quadratic and convex
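
The completing-the-squares argument yields a backward Riccati recursion; a sketch on an assumed double-integrator system (A, B, Q, R below are illustrative, not from the notes):

```python
import numpy as np

# Finite-horizon LQR via the Riccati recursion (illustrative system).
A = np.array([[1.0, 1.0], [0.0, 1.0]])   # double integrator
B = np.array([[0.0], [1.0]])
Q = np.eye(2)           # state cost weight
R = np.array([[1.0]])   # control cost weight
T = 20

P = Q.copy()            # terminal cost x' Q x
gains = []
for _ in range(T):
    # completing the squares gives the linear policy u = -K x with
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    # and the quadratic convex value function x' P x with
    P = Q + A.T @ P @ (A - B @ K)
    gains.append(K)
gains.reverse()
print(gains[0])         # time-0 feedback gain
```

The structural properties are visible directly: each `K` gives a linear policy, and each `P` is symmetric positive semidefinite, so the value function is quadratic and convex.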

- Tue Feb 5
- MDPs with partial observations: the finite case
- problem formulation and goals
- prelude: hidden Markov models
- nonlinear filter recursion and Markov property
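
The nonlinear filter recursion for a finite HMM is a predict step (Chapman-Kolmogorov) followed by a Bayes correction; the normalization is what makes it nonlinear. A sketch with assumed toy matrices and observations:

```python
import numpy as np

# Nonlinear filter for a finite hidden Markov model (toy numbers):
# pi_t(x) = P(X_t = x | Y_1..Y_t), updated by predict + Bayes correction.
P = np.array([[0.95, 0.05],   # signal transition matrix
              [0.10, 0.90]])
O = np.array([[0.8, 0.2],     # O[x, y] = P(Y = y | X = x)
              [0.3, 0.7]])

def filter_step(pi, y):
    pred = pi @ P                # prediction step
    post = pred * O[:, y]        # multiply by observation likelihood
    return post / post.sum()     # normalize (the nonlinear part)

pi = np.array([0.5, 0.5])
for y in [0, 0, 1, 1, 1]:        # an assumed observation sequence
    pi = filter_step(pi, y)
print(pi)                        # posterior over the hidden state
```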

- Thu Feb 7

[week 4 notes] - MDPs with partial observations, continued
- definition of admissible policies and expected cost
- state space augmentation: belief state
- belief state process as a controlled Markov process

- Tue Feb 12
- MDPs with partial observations, continued
- reduction to the fully observed case via belief state
- dynamic programming in belief space
- example: the two-armed bandit

- Thu Feb 14

[week 5 notes] - The two-armed bandit problem
- state-space formulation
- derivation of the belief state evolution
- optimality of the index policy
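
One standard instance of the belief-state evolution is the Bayesian Bernoulli bandit, where independent Beta priors on the unknown success probabilities make the belief update conjugate (the course's exact formulation may differ; this only illustrates the belief dynamics, not the index policy itself):

```python
# Belief-state evolution for a Bernoulli two-armed bandit, assuming
# independent Beta priors on the unknown success probabilities.
beliefs = [[1, 1], [1, 1]]          # Beta(alpha, beta) parameters per arm

def pull(arm, reward):
    """Conjugate update: a success bumps alpha, a failure bumps beta."""
    a, b = beliefs[arm]
    beliefs[arm] = [a + reward, b + (1 - reward)]

def posterior_mean(arm):
    a, b = beliefs[arm]
    return a / (a + b)

for arm, r in [(0, 1), (0, 1), (1, 0), (0, 0)]:  # assumed play/reward history
    pull(arm, r)
print(posterior_mean(0), posterior_mean(1))
```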

- Tue Feb 19
- Stochastic control over an infinite horizon
- infinite-horizon optimality criteria: discounted cost, average cost
- discounted cost
- Discounted-Cost Optimality Equation
- contraction property of the dynamic programming operator
- existence and uniqueness of the value function, value iteration
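
Because the DP operator is a beta-contraction in the sup norm, value iteration converges geometrically to the unique fixed point of the Discounted-Cost Optimality Equation. A sketch on an assumed toy MDP:

```python
import numpy as np

# Value iteration for V = min_a [ c(·, a) + beta * P_a V ] (toy data).
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.6, 0.4]]])   # P[a][x, y]
c = np.array([[1.0, 2.0], [0.0, 0.5]])     # c[x, a]
beta = 0.9

V = np.zeros(2)
for _ in range(500):
    TV = (c + beta * np.einsum('axy,y->xa', P, V)).min(axis=1)
    if np.max(np.abs(TV - V)) < 1e-10:     # contraction => unique fixed point
        V = TV
        break
    V = TV
print(V)   # approximate fixed point of the DP operator
```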

- Thu Feb 21

[week 6 notes] - Discounted cost, continued
- policy iteration
- structure of discount-optimal policies
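
Policy iteration alternates exact policy evaluation (a linear solve) with greedy improvement, and terminates at a discount-optimal policy in finitely many steps on a finite MDP. A sketch on the same assumed toy data:

```python
import numpy as np

# Policy iteration for a toy discounted MDP (data assumed).
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.6, 0.4]]])   # P[a][x, y]
c = np.array([[1.0, 2.0], [0.0, 0.5]])     # c[x, a]
beta = 0.9
n = 2

pi = np.zeros(n, dtype=int)                # initial policy: always action 0
while True:
    # evaluate: V_pi solves (I - beta P_pi) V = c_pi
    P_pi = P[pi, np.arange(n)]             # row x of P[pi[x]]
    c_pi = c[np.arange(n), pi]
    V = np.linalg.solve(np.eye(n) - beta * P_pi, c_pi)
    # improve: greedy with respect to V
    new_pi = (c + beta * np.einsum('axy,y->xa', P, V)).argmin(axis=1)
    if np.array_equal(new_pi, pi):
        break                              # fixed point: pi is discount-optimal
    pi = new_pi
print(pi, V)
```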

- Tue Feb 26
- Average-cost optimality criterion
- canonical triplets
- Average-Cost Optimality Equation
- relative value function as terminal cost for a finite-horizon problem
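
Relative value iteration solves the Average-Cost Optimality Equation g + h(x) = min_a [c(x,a) + Σ_y P(y|x,a) h(y)] by normalizing the DP iterates at a reference state. A sketch on an assumed toy MDP (convergence uses that all transition probabilities here are positive):

```python
import numpy as np

# Relative value iteration for the average-cost optimality equation.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.6, 0.4]]])   # P[a][x, y], toy data
c = np.array([[1.0, 2.0], [0.0, 0.5]])     # c[x, a]

h = np.zeros(2)
for _ in range(2000):
    Th = (c + np.einsum('axy,y->xa', P, h)).min(axis=1)
    g = Th[0]                  # normalize the relative value at state 0
    h_new = Th - g
    if np.max(np.abs(h_new - h)) < 1e-12:
        h = h_new
        break
    h = h_new
print(g, h)   # optimal average cost and relative value function
```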

- Thu Feb 28

- Tue Mar 5
- Linear Quadratic Regulator under the average-cost criterion
- quadratic ansatz for the relative value function, discrete algebraic Riccati equation
- conditions for existence and uniqueness of the fixed point of the DARE
- existence of stabilizing optimal controller
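
The fixed point of the DARE can be computed by iterating the finite-horizon Riccati recursion; under stabilizability of (A, B) and detectability of (A, Q^{1/2}) the iteration converges to the unique stabilizing solution. A sketch on an assumed double-integrator system:

```python
import numpy as np

# Solving the discrete algebraic Riccati equation (DARE) by iterating the
# Riccati recursion to its fixed point (illustrative system matrices).
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

P = Q.copy()
for _ in range(1000):
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    P_new = Q + A.T @ P @ (A - B @ K)
    if np.max(np.abs(P_new - P)) < 1e-12:
        P = P_new
        break
    P = P_new
K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
print(np.abs(np.linalg.eigvals(A - B @ K)).max())  # < 1: stabilizing controller
```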

- Thu Mar 7

[week 8 notes] - Linear-Quadratic-Gaussian (LQG) problem with partial observations
- problem formulation
- linear-Gaussian HMM: derivation of the Kalman filter
- controllability and observability conditions for existence and uniqueness of the steady-state Kalman filter
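
The Kalman filter is the predict/update recursion for the linear-Gaussian observation model x_{t+1} = A x_t + w_t, y_t = C x_t + v_t. A sketch with assumed matrices and observations (all numbers illustrative):

```python
import numpy as np

# One predict/update cycle of the Kalman filter (illustrative model).
A = np.array([[1.0, 1.0], [0.0, 1.0]])
C = np.array([[1.0, 0.0]])
W = 0.1 * np.eye(2)      # process noise covariance
V = np.array([[0.5]])    # observation noise covariance

def kalman_step(x_hat, P, y):
    # predict
    x_pred = A @ x_hat
    P_pred = A @ P @ A.T + W
    # update: the Kalman gain weighs the innovation y - C x_pred
    S = C @ P_pred @ C.T + V
    K = P_pred @ C.T @ np.linalg.inv(S)
    x_new = x_pred + K @ (y - C @ x_pred)
    P_new = (np.eye(2) - K @ C) @ P_pred
    return x_new, P_new

x_hat, P = np.zeros(2), np.eye(2)
for y in [np.array([0.9]), np.array([2.1]), np.array([2.9])]:  # assumed data
    x_hat, P = kalman_step(x_hat, P, y)
print(x_hat, P)
```

Iterating the covariance recursion alone gives the steady-state filter whose existence and uniqueness the controllability/observability conditions guarantee.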

- Tue Mar 12
- Review
- Thu Mar 14
- **In-Class Midterm**

- Tue Mar 19
- Thu Mar 21
- **SPRING BREAK**

- Tue Mar 26

- Thu Mar 28

[week 10 notes] - Partially observed LQG
- review: fully observed LQR, the Kalman filter
- state estimation in the presence of control: information equivalence, linear dynamics of the information state
- partially observed LQG problem, finite-horizon and infinite horizon
- separation between estimation and control, the dual effect

- Tue Apr 2

- Thu Apr 4

[week 11 notes] - Continuous-time Markov processes
- abstract definition, transition kernels, Chapman-Kolmogorov equation, time-homogeneous processes
- finite-state continuous-time Markov processes
- pure-jump processes, transition intensities and Q-matrices
- forward Kolmogorov equation
- a construction using Poisson processes: uniformization
- diffusion processes
- definition: local drift coefficient and diffusion matrix
- infinitesimal generator: second-order diffusion operator
- a construction using an Euler scheme and Brownian motion, conditions for well-posedness of the forward Kolmogorov equation
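
The uniformization construction can be checked numerically: with lam at least the largest exit rate, P = I + Q/lam is a stochastic matrix, and running this jump chain on a Poisson(lam) clock reproduces the CTMC, i.e. e^{tQ} = Σ_k Poisson(lam·t)(k) P^k. A sketch with an assumed toy Q-matrix:

```python
import numpy as np

# Uniformization of a finite-state CTMC (toy generator matrix).
Q = np.array([[-2.0, 2.0],
              [1.0, -1.0]])
lam = 2.0                      # lam >= max_x |Q[x, x]|
P = np.eye(2) + Q / lam        # stochastic jump-chain kernel: [[0, 1], [0.5, 0.5]]

t = 0.7
approx = np.zeros((2, 2))
weight = np.exp(-lam * t)      # Poisson(lam * t) pmf at k = 0
Pk = np.eye(2)
for k in range(60):            # truncate the Poisson sum
    approx += weight * Pk
    Pk = Pk @ P
    weight *= lam * t / (k + 1)
print(approx)                  # approximates the transition matrix e^{tQ}
```

For this two-state generator the matrix exponential is known in closed form, which makes the construction easy to verify.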

- Tue Apr 9

- Thu Apr 11

[week 12 notes] - Ito calculus and controlled diffusion processes
- Filtrations, adapted processes, stochastic integral in the sense of K. Ito
- Ito's lemma
- Controlled diffusion processes: expected cost, cost-to-go, value function
- Necessary condition for optimality, the Hamilton-Jacobi-Bellman equation
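
For the controlled diffusion dX_t = b(X_t, u_t) dt + σ(X_t, u_t) dW_t with running cost c and terminal cost g, the HJB equation in its standard finite-horizon form (sign conventions vary across texts) reads:

```latex
\partial_t V(t,x)
  + \min_{u}\Big[\, c(x,u) + b(x,u)^\top \nabla_x V(t,x)
  + \tfrac{1}{2}\,\mathrm{tr}\big(\sigma\sigma^\top(x,u)\,\nabla_x^2 V(t,x)\big) \Big] = 0,
\qquad V(T,x) = g(x).
```

The bracketed expression applies the infinitesimal generator of the controlled diffusion to V, so this is the continuous-time analogue of the discrete-time DP recursion.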

- Tue Apr 16

- Thu Apr 18

[week 13 notes] - Markov decision processes in continuous time: examples
- The linear quadratic regulator in continuous time: HJB equation and Riccati equation, optimal policy, value function
- Controlled continuous-time finite-state Markov processes

- Tue Apr 23

- Mon Apr 29

[week 14 notes] - Nonlinear filtering in continuous time
- The linear quadratic Gaussian problem in continuous time: the Kalman-Bucy filter
- General nonlinear filtering in continuous time
- the reference measure approach: Girsanov change of measure, the Kallianpur-Striebel formula
- evolution of the nonlinear filter: unnormalized (Duncan-Mortensen-Zakai equation) and normalized (Kushner-Stratonovich equation)
- finite-state signal: the Wonham filter