Tutorials
- Mikhail Belkin (Halicioglu Data Science Institute and Computer Science and Engineering Department at UCSD, and Amazon Scholar): Theory of Deep Learning
- Siva Theja Maguluri (H. Milton Stewart School of Industrial & Systems Engineering at Georgia Tech): Reinforcement Learning and Stochastic Approximation
- Prabha Mandayam (Department of Physics, Indian Institute of Technology Madras): Quantum Information Theory and Quantum Error Correction
Mikhail Belkin
Theory of Deep Learning
Generalization is the central topic of machine learning and data science. What patterns can be learned from observations, and how can we be sure that they extend to future, not yet seen, data? I will try to outline the arc of recent developments in our understanding of generalization in machine learning. These changes occurred largely due to empirical findings in neural networks, which necessitated revisiting the theoretical foundations of generalization. Classically, many analyses relied on the assumption that the training loss was an accurate proxy for the test loss. This turned out to be unfounded, as good practical predictors frequently have training loss that is much lower than the test loss. Theoretical developments, such as analyses of interpolation and double descent, have recently shed light on that issue. In view of this, a common practical prescription has become to mostly ignore the training loss and to adopt early stopping – to stop the model training once the validation loss plateaus. The recent discovery of emergent phenomena like grokking shows that this practice is also not generally justifiable, as at least in some settings, the test loss up to a certain iteration may not be predictive of the test loss just a few iterations later. I will discuss why this presents a fundamental challenge to both the theory and practice of machine learning, and attempt to describe the current state of affairs.
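As a toy illustration of the double-descent phenomenon mentioned in the abstract (background only, not part of the tutorial material), the sketch below fits minimum-norm least-squares regression on random Fourier features of increasing width; the test error typically rises near the interpolation threshold (number of features roughly equal to the number of training points) and falls again beyond it. All names, parameters, and data choices here are illustrative assumptions.

```python
# Toy double-descent sketch: minimum-norm least squares on random Fourier features.
# Illustrative only; the data, feature map, and parameters are arbitrary choices.
import numpy as np

rng = np.random.default_rng(0)
n_train, n_test = 40, 500
x_train = rng.uniform(-1, 1, n_train)
x_test = rng.uniform(-1, 1, n_test)
target = lambda x: np.sin(2 * np.pi * x)
y_train = target(x_train) + 0.1 * rng.standard_normal(n_train)
y_test = target(x_test)

def features(x, num_feat, freqs):
    # Random Fourier features: cos(w * x + b) with fixed random w, b.
    w, b = freqs
    return np.cos(np.outer(x, w[:num_feat]) + b[:num_feat])

max_feat = 200
freqs = (rng.normal(0, 10, max_feat), rng.uniform(0, 2 * np.pi, max_feat))

for num_feat in [5, 10, 20, 40, 80, 200]:
    Phi_train = features(x_train, num_feat, freqs)
    Phi_test = features(x_test, num_feat, freqs)
    # Minimum-norm least-squares solution (interpolates once num_feat >= n_train).
    coef = np.linalg.pinv(Phi_train) @ y_train
    train_err = np.mean((Phi_train @ coef - y_train) ** 2)
    test_err = np.mean((Phi_test @ coef - y_test) ** 2)
    print(f"features={num_feat:4d}  train MSE={train_err:.4f}  test MSE={test_err:.4f}")
```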
Mikhail Belkin is a Professor at the Halicioglu Data
Science Institute and the Computer Science and Engineering Department at
UCSD, and an Amazon Scholar. Prior to that he was a Professor at the
Department of Computer Science and Engineering and the Department of
Statistics at the Ohio State University. He received his Ph.D. from
the Department of Mathematics at the University of Chicago (advised by
Partha Niyogi). His research interests are broadly in theory and
applications of machine learning, deep learning and data
analysis. Some of his well-known work includes widely used Laplacian
Eigenmaps, Graph Regularization and Manifold Regularization
algorithms, which brought ideas from classical differential geometry
and spectral graph theory to data science. His more recent work has
been concerned with understanding remarkable mathematical and
statistical phenomena observed in deep learning. The empirical
evidence necessitated revisiting some of the classical concepts in
statistics and optimization, including the basic notion of
over-fitting. One of his key findings has been the “double descent”
risk curve that extends the textbook U-shaped bias-variance trade-off
curve beyond the point of interpolation. His recent work focuses on
understanding feature learning and over-parameterization in deep
learning. Mikhail Belkin is an ACM Fellow and a recipient of an NSF
CAREER Award and a number of best paper and other awards. He has
served on the editorial boards of IEEE Transactions on Pattern
Analysis and Machine Intelligence and the Journal of Machine Learning
Research. He is the Editor-in-Chief of the SIAM Journal on Mathematics
of Data Science (SIMODS).
Siva Theja Maguluri
Reinforcement Learning and Stochastic Approximation
We consider the problem of solving fixed-point equations of the form $F(x) = x$, where $F(\cdot)$ is an operator that is a contraction with respect to some norm (e.g., the $\ell^\infty\text{-norm}$). Stochastic approximation (SA) is an algorithm first proposed by Robbins and Monro to solve such equations when only noisy samples of the operator are accessible. SA is at the heart of several machine learning algorithms today. The focus of this tutorial is on characterizing the rate of convergence of SA and on illustrating its use in applications such as reinforcement learning (RL).
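For intuition (background only, not from the tutorial itself), here is a minimal sketch of the Robbins-Monro iteration $x_{k+1} = x_k + \alpha_k\,(\hat{F}(x_k) - x_k)$ applied to a noisy contraction; the particular operator, noise model, and step sizes below are illustrative assumptions.

```python
# Minimal stochastic-approximation sketch (illustrative assumptions throughout):
# solve F(x) = x for a contraction F using only noisy evaluations of F.
import numpy as np

rng = np.random.default_rng(1)

A = np.array([[0.5, 0.2],
              [0.1, 0.4]])      # row sums < 1, so F below is an l-infinity contraction
b = np.array([1.0, -2.0])

def F(x):
    return A @ x + b            # affine contraction; fixed point solves (I - A) x = b

x_star = np.linalg.solve(np.eye(2) - A, b)   # true fixed point, for reference

x = np.zeros(2)
for k in range(1, 20001):
    noisy_F = F(x) + 0.5 * rng.standard_normal(2)   # only noisy samples of F(x)
    alpha = 1.0 / (k + 10)                           # diminishing step size
    x = x + alpha * (noisy_F - x)                    # Robbins-Monro / SA update

print("SA iterate:", x)
print("fixed point:", x_star)
print("error norm:", np.linalg.norm(x - x_star))
```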
We first present finite-time bounds on the mean-square convergence error and show how they translate to the case of a few RL algorithms. The key to obtaining our mean-square bounds is the construction of a smooth Lyapunov/potential function based on the infimal convolution and the generalized Moreau envelope, which enables us to control the stochastic errors in the SA algorithm. We will also present an overview of several generalizations of SA that can be obtained using this type of Lyapunov argument, such as SA under seminorm-contractive operators. Finally, we present high-probability bounds on the error tail, which are also obtained using exponential supermartingales in conjunction with the Moreau envelope and a novel bootstrapping approach.
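For readers unfamiliar with the terminology, the standard definitions of the two smoothing constructions named above are sketched here; the exact Lyapunov construction used in the tutorial may differ in its choice of smoothing function and parameters.

```latex
% Background definitions (standard; the tutorial's precise construction may differ).
\[
  (f \,\square\, g)(x) \;=\; \inf_{u} \bigl\{\, f(u) + g(x - u) \,\bigr\}
  \qquad \text{(infimal convolution)}
\]
\[
  M_{\mu} f(x) \;=\; \inf_{u} \Bigl\{\, f(u) + \tfrac{1}{2\mu}\, \lVert x - u \rVert_2^2 \Bigr\}
  \qquad \text{(Moreau envelope, } \mu > 0\text{)}
\]
% The Moreau envelope is a smooth approximation of f. A generalized Moreau envelope
% replaces the scaled squared Euclidean term by another smooth function, which allows
% smoothing non-Euclidean norms such as the $\ell^\infty$-norm in the contraction assumption.
```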
Prof. Siva Theja Maguluri is the Fouts Family Early Career Professor
and an Associate Professor in the H. Milton Stewart School of
Industrial & Systems Engineering at Georgia Tech. Before joining
Georgia Tech, Prof. Maguluri spent two years in the Stochastic
Processes and Optimization group within the Mathematical Sciences
Department at the IBM T. J. Watson Research Center. He completed a
Ph.D. in Electrical and Computer Engineering (ECE) from the University
of Illinois at Urbana-Champaign (UIUC) in 2014, under the supervision
of Prof. R. Srikant, following an MS in ECE also advised by
Prof. Srikant and Prof. Bruce Hajek. Prof. Maguluri additionally holds
an MS in Applied Mathematics from UIUC and a B.Tech in Electrical
Engineering from the Indian Institute of Technology Madras.
Prof. Maguluri has received several notable awards, including the NSF
CAREER award in 2021, the 2017 Best Publication in Applied Probability
Award from INFORMS Applied Probability Society, and second prize in
the 2020 INFORMS Junior Faculty Interest Group (JFIG) Best Paper
Competition. Joint research with his students received the Stephen
S. Lavenberg Best Student Paper Award at IFIP
Performance 2021. Recognized for teaching excellence, Prof. Maguluri
received the Student Recognition of Excellence in Teaching: Class of
1934 CIOS Award in 2020 for ISyE 6761 and the CTL/BP Junior Faculty
Teaching Excellence Award in 2020, presented by the Center for
Teaching and Learning at Georgia Tech.
Prabha Mandayam
Quantum Information Theory and Quantum Error Correction
This set of lectures aims to give a flavour of quantum information theory and quantum error correction codes. Details below:
- Lecture 1: Quantum states and gates, Qubit and Bloch sphere, pure and mixed states, quantum measurements: projective measurements and positive operator-valued measures (POVMs), Distinguishing quantum states.
- Lecture 2: The No-cloning Theorem, Entanglement and Bell states, Teleportation and Superdense Coding. Von Neumann Entropy and basic properties: subadditivity and negative conditional entropy.
- Lecture 3: Quantum Channels: completely positive trace-preserving (CPTP) maps, Choi theorem and Kraus representation. Examples: quantum bit-flip and phase-flip channels, depolarising channel, quantum erasure channel and amplitude-damping channel.
- Lecture 4: Quantum Error Correction: Quantum repetition code (see the sketch after this list), Shor’s 9-qubit code, Stabilizer codes.
- Lecture 5: Quantum Information Processing: The Holevo bound (proof sketch), brief discussion on classical and quantum capacities of quantum channels.
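As referenced under Lecture 4, here is a minimal sketch (background illustration only, not tutorial material) of the three-qubit bit-flip repetition code: a logical qubit is encoded by spreading its basis amplitudes across three qubits, a single bit-flip error is applied, and the error is located from the two stabilizer parities and undone. The state-vector simulation uses plain NumPy, and all names and values are illustrative choices.

```python
# Minimal 3-qubit bit-flip repetition-code sketch (illustrative, plain NumPy).
import numpy as np

def kron_all(*ops):
    out = np.array([[1.0 + 0j]])
    for op in ops:
        out = np.kron(out, op)
    return out

I = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

# Logical encoding: a|0> + b|1>  ->  a|000> + b|111>
a, b = 0.6, 0.8
encoded = np.zeros(8, dtype=complex)
encoded[0b000] = a
encoded[0b111] = b

# Single bit-flip error on qubit 1 (qubits indexed left to right from 0).
error = kron_all(I, X, I)
corrupted = error @ encoded

# Stabilizer parities Z0 Z1 and Z1 Z2; on a single-bit-flip state these
# expectation values are exactly +1 or -1, so they give a deterministic syndrome.
s1 = np.real(np.vdot(corrupted, kron_all(Z, Z, I) @ corrupted))
s2 = np.real(np.vdot(corrupted, kron_all(I, Z, Z) @ corrupted))

# Decode: (-1, +1) -> flip qubit 0, (-1, -1) -> qubit 1, (+1, -1) -> qubit 2.
syndrome = (round(s1), round(s2))
correction = {(1, 1): kron_all(I, I, I),
              (-1, 1): kron_all(X, I, I),
              (-1, -1): kron_all(I, X, I),
              (1, -1): kron_all(I, I, X)}[syndrome]
recovered = correction @ corrupted

print("syndrome:", syndrome)
print("recovered equals encoded:", np.allclose(recovered, encoded))
```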
Prof. Prabha Mandayam is an Associate Professor in the Department of
Physics at the Indian Institute of Technology, Madras. Previously, she
was an Inspire Faculty Fellow at the Chennai Mathematical Institute
and a Postdoctoral Fellow with the Optics and Quantum Information
Group at the Institute of Mathematical Sciences. She obtained her
Ph.D. in Physics from the Institute for Quantum Information and Matter
at Caltech under the supervision of John Preskill. Prof. Mandayam’s
research interests lie in the field of quantum computing and quantum
information theory. Specifically, she focuses on quantum error
correction, the interplay between quantum foundations and quantum
cryptography, and using quantum information as a tool to explore
fundamental questions in theoretical physics.