The area of data science known as "online machine learning"
considers problems where data becomes available sequentially and a
decision must be made at each time based on the information then
available. To find decision algorithms that come with guarantees, such
problems are often considered in an "adversarial" setting. This amounts
to considering a two-person, zero-sum, repeated game involving the
player making decisions (who tries to obtain the best possible outcome)
and his adversary (who chooses the evolution of the data so as to give
the other player the worst possible outcome).
It has only recently been recognized that this topic has strong
connections to partial differential equations. In fact, one is mainly
interested in what happens after many time steps. Therefore it is
natural to consider a scaled version of the problem, in much the same
way that we study the large-time behavior of a random walk by
considering Brownian motion. The dynamic programming principle for the
scaled game is, roughly speaking, a numerical scheme for solving the
My talk will focus on a particular example, known as “Prediction with
Expert Advice”, in which the connection to PDE is now well-understood
and has led to new results. In this example, a "predictor" has access to
guidance from N "experts," whose outcomes are chosen by an "adversary".
At each time step the predictor must combine the experts' guidance using
a mixed strategy, with the goal of minimizing his worst-case final-time
shortfall ("regret") with respect to the best-performing expert. The
relevant PDE determines the predictor’s worst-case regret, and also the
players’ optimal strategies.
The PDE has been solved explicitly in a few special cases involving
small numbers of experts. But the PDE viewpoint also suggests another
approach, namely to seek upper and lower bounds using explicit
supersolutions and subsolutions. The upper bounds obtained this way are
in some sense familiar, as they involve potential-based strategies for
the predictor (using novel, time-dependent potentials). The lower bounds
are less familiar, as they involve potential-based strategies for the