The course will primarily cover online learning in stochastic environments. The emphasis will be on proving formal performance guarantees for algorithms, as well as fundamental limits on the performance of any algorithm.
The first half of the course will focus on variants of the multi-armed bandit problem:
- Regret minimization: algorithms and information-theoretic lower bounds (the regret objective is written out after the topic list below)
- Pure exploration: fixed budget and fixed confidence
- Linear bandits, contextual bandits, Bayesian bandits (Thompson sampling; a minimal sketch follows the topic list)
- Background on MDPs
- Markovian bandits (rested and restless); Gittins and Whittle index
- Reinforcement learning
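
For reference, the regret objective behind the first topic can be written out as follows; the notation (mu_i for the arm means, A_t for the arm played at round t, R_T for the regret over horizon T) is a standard convention assumed here, not fixed by the course description.

```latex
% Expected regret over T rounds of a K-armed stochastic bandit.
% \mu_i : mean reward of arm i, \qquad \mu^* = \max_{1 \le i \le K} \mu_i,
% A_t   : arm selected at round t by the algorithm.
R_T \;=\; T\,\mu^* \;-\; \mathbb{E}\!\left[\, \sum_{t=1}^{T} \mu_{A_t} \right]
```

Regret minimization seeks policies for which R_T grows sublinearly in T; the information-theoretic lower bounds in the same topic establish rates below which no algorithm can go.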
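
To make the Thompson sampling topic concrete, here is a minimal sketch for a Bernoulli bandit with independent Beta(1, 1) priors; the arm means, horizon, and function name below are illustrative assumptions, not anything prescribed by the course.

```python
import numpy as np

def thompson_sampling(true_means, horizon, rng=np.random.default_rng(0)):
    """Thompson sampling for a Bernoulli bandit with Beta(1, 1) priors on each arm."""
    k = len(true_means)
    successes = np.ones(k)  # Beta posterior parameter alpha for each arm
    failures = np.ones(k)   # Beta posterior parameter beta for each arm
    total_reward = 0.0
    for _ in range(horizon):
        # Sample a plausible mean for each arm from its posterior ...
        samples = rng.beta(successes, failures)
        arm = int(np.argmax(samples))  # ... and play the arm with the largest sample
        reward = rng.binomial(1, true_means[arm])
        successes[arm] += reward       # conjugate posterior update
        failures[arm] += 1 - reward
        total_reward += reward
    return total_reward

# Example: 3 arms, best arm has mean 0.7
print(thompson_sampling([0.3, 0.5, 0.7], horizon=10_000))
```

Because the Beta prior is conjugate to Bernoulli rewards, the posterior update reduces to two counter increments per round.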
Text/References
- T. Lattimore and C. Szepesvári, “Bandit Algorithms,” Cambridge University Press, 2020. Draft available at http://tor-lattimore.com/downloads/book/book.pdf.
- A. Slivkins, “Introduction to Multi-Armed Bandits,” NOW Publishers, 2019.
- D. Russo et al., “A Tutorial on Thompson Sampling,” NOW Publishers, 2018.
- S. Shalev-Shwartz, “Online Learning and Online Convex Optimization,” NOW Publishers, 2011.
- Contemporary research papers