## Oliver Club

The area of data science known as "online machine learning"

considers problems where data becomes available sequentially and a

decision must be made at each time based on the information then

available. To find decision algorithms that come with guarantees, such

problems are often considered in an "adversarial" setting. This amounts

to considering a two-person, zero-sum, repeated game involving the

player making decisions (who tries to obtain the best possible outcome)

and his adversary (who chooses the evolution of the data so as to give

the other player the worst possible outcome).

It has only recently been recognized that this topic has strong

connections to partial differential equations. In fact, one is mainly

interested in what happens after many time steps. Therefore it is

natural to consider a scaled version of the problem, in much the same

way that we study the large-time behavior of a random walk by

considering Brownian motion. The dynamic programming principle for the

scaled game is, roughly speaking, a numerical scheme for solving the

relevant PDE.

My talk will focus on a particular example, known as “Prediction with

Expert Advice”, in which the connection to PDE is now well-understood

and has led to new results. In this example, a "predictor" has access to

guidance from N "experts," whose outcomes are chosen by an "adversary".

At each time step the predictor must combine the experts' guidance using

a mixed strategy, with the goal of minimizing his worst-case final-time

shortfall ("regret") with respect to the best-performing expert. The

relevant PDE determines the predictor’s worst-case regret, and also the

players’ optimal strategies.

The PDE has been solved explicitly in a few special cases involving

small numbers of experts. But the PDE viewpoint also suggests another

approach, namely to seek upper and lower bounds using explicit

supersolutions and subsolutions. The upper bounds obtained this way are

in some sense familiar, as they involve potential-based strategies for

the predictor (using novel, time-dependent potentials). The lower bounds

are less familiar, as they involve potential-based strategies for the

adversary.