Online Learning in Periodic Zero-Sum Games

Alternating Gradient Descent-Ascent

Example with time-average game = 0

With the typical Matching Pennies payoff matrix, the time average of the payoff matrix A(t) is 0. This results in gradient descent trajectories that do not converge in time average to 0.
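As a rough illustration of this setup, the sketch below runs alternating gradient descent-ascent on a bilinear game x^T A(k) y whose payoff matrix has zero time average, and computes running time averages of the iterates. The sinusoidal scaling in `payoff`, the step size `eta`, and all function names are assumptions made for illustration; they are not taken from the notebook's own code.

```python
import numpy as np

# Base Matching Pennies payoff matrix for the row player.
BASE = np.array([[1.0, -1.0], [-1.0, 1.0]])


def payoff(k, period):
    """Hypothetical periodic payoff A(k): the base matrix times a sine wave.

    Summed over one full period, the sine factor (and hence A) averages to 0.
    """
    return np.sin(2.0 * np.pi * k / period) * BASE


def alternating_gda(x0, y0, eta, steps, period):
    """Alternating gradient descent-ascent on x^T A(k) y.

    x takes a descent step first; y then takes an ascent step using the
    freshly updated x. Returns two arrays of shape (steps + 1, 2).
    """
    x = np.array(x0, dtype=float)
    y = np.array(y0, dtype=float)
    xs, ys = [x.copy()], [y.copy()]
    for k in range(steps):
        A = payoff(k, period)
        x = x - eta * (A @ y)      # descent step for the minimizing player
        y = y + eta * (A.T @ x)    # ascent step, uses the updated x
        xs.append(x.copy())
        ys.append(y.copy())
    return np.array(xs), np.array(ys)


def running_average(traj):
    """Time average of the first k iterates, for every k."""
    counts = np.arange(1, len(traj) + 1)[:, None]
    return np.cumsum(traj, axis=0) / counts
```

Calling `running_average(xs)` on the output of `alternating_gda` gives the curve one would plot to inspect the time-average behaviour; the sketch makes no claim about what that curve looks like for any particular choice of parameters.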

We rescale the off-diagonal elements of the payoff matrices. With this rescaling, the gradient descent trajectories converge in time average to 0.

Time Averages

Matching Pennies game with periodic rescaling

Rescaled off-diagonal Matching Pennies

Piecewise function counterexample

Replicator Dynamics - Matching Pennies
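For orientation, here is a minimal sketch of replicator dynamics on a time-varying Matching Pennies game, integrated with a hand-written fourth-order Runge-Kutta step. The cosine scaling in `payoff`, the step size `dt`, and the function names are illustrative assumptions, not the notebook's actual implementation.

```python
import numpy as np

# Base Matching Pennies payoff matrix for the row player.
BASE = np.array([[1.0, -1.0], [-1.0, 1.0]])


def payoff(t, period):
    """Hypothetical periodic payoff A(t): the base matrix times a cosine."""
    return np.cos(2.0 * np.pi * t / period) * BASE


def replicator_field(t, x, y, period):
    """Replicator vector field for both players of the zero-sum game."""
    A = payoff(t, period)
    u = A @ y            # row player's payoff for each pure strategy
    v = -(A.T @ x)       # column player's payoff (zero-sum)
    dx = x * (u - x @ u)
    dy = y * (v - y @ v)
    return dx, dy


def simulate_replicator(x0, y0, dt, steps, period):
    """Integrate the replicator dynamics with classical RK4."""
    x = np.array(x0, dtype=float)
    y = np.array(y0, dtype=float)
    xs, ys = [x.copy()], [y.copy()]
    for k in range(steps):
        t = k * dt
        k1x, k1y = replicator_field(t, x, y, period)
        k2x, k2y = replicator_field(t + dt / 2, x + dt / 2 * k1x, y + dt / 2 * k1y, period)
        k3x, k3y = replicator_field(t + dt / 2, x + dt / 2 * k2x, y + dt / 2 * k2y, period)
        k4x, k4y = replicator_field(t + dt, x + dt * k3x, y + dt * k3y, period)
        x = x + dt / 6 * (k1x + 2 * k2x + 2 * k3x + k4x)
        y = y + dt / 6 * (k1y + 2 * k2y + 2 * k3y + k4y)
        xs.append(x.copy())
        ys.append(y.copy())
    return np.array(xs), np.array(ys)
```

Each mixed strategy starts on the probability simplex and the vector field sums to zero across coordinates, so the components of `x` and `y` keep summing to 1 up to floating-point error.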

Time Averages

Large Scale Simulation

Create time invariant plot

Note that we reduce the number of simulation steps from 50000 to 500 to reduce the data size. The full-sized plot can be found in the paper.

Use a sparser graph without randomized periodic functions, to reduce the number of iterations needed to achieve recurrence.

First 600 iterations of the simplified polymatrix simulation. The simulation takes about 6226 iterations to recur.
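One simple way to estimate a recurrence time from a stored trajectory is to find the first iteration at which the state comes back within a tolerance of its starting point, after having left that neighbourhood. The helper below is a sketch under that assumption; the name `first_recurrence`, the Euclidean distance, and the tolerance are illustrative choices and not necessarily how the figure quoted above was obtained.

```python
import numpy as np


def first_recurrence(trajectory, tol):
    """Index of the first return to within `tol` of the initial state.

    The trajectory must first leave the tol-ball around its start, so the
    initial stretch of nearby points is not counted as a recurrence.
    Returns None if the trajectory never leaves or never comes back.
    """
    traj = np.asarray(trajectory, dtype=float)
    dist = np.linalg.norm(traj - traj[0], axis=1)
    outside = np.flatnonzero(dist > tol)
    if outside.size == 0:
        return None
    left = outside[0]                       # first index outside the ball
    back = np.flatnonzero(dist[left:] <= tol)
    if back.size == 0:
        return None
    return int(left + back[0])


# Usage on a toy trajectory: a point moving around the unit circle
# with a period of exactly 100 steps.
angles = 2.0 * np.pi * np.arange(251) / 100
circle = np.column_stack([np.cos(angles), np.sin(angles)])
recurrence_step = first_recurrence(circle, tol=0.05)
```

For a high-dimensional simulation the result depends strongly on `tol`, so the tolerance should be reported alongside any recurrence time.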