Abstract: Consider a predictor, a learner, whose input is a stream of discrete
items. The predictor's task, at every time point, is {\em
probabilistic multiclass prediction}, \ie to predict which item may
occur next by outputting zero or more candidate items, each with a
probability, after which the actual item is revealed and the predictor
learns from this observation. To output probabilities, the predictor
keeps track of the proportions of the items it has seen. The stream
is unbounded, the set of items is unknown to the predictor and can
itself grow without bound, and the predictor has limited space, so we
seek efficient prediction and update techniques.
Moreover, there is {\em non-stationarity}: the underlying frequencies
of items may change substantially from time to time. For instance,
new items may start appearing and a few recently frequent items may
cease to occur. The predictor, being space-bounded, need only
provide probabilities for those items with (currently) {\em
sufficiently high} frequency, \ie the {\em salient} items. This
problem is motivated by {\em prediction games}, a
self-supervised learning regime where concepts serve as {\em both the
predictors and the predictands}, and the set of concepts grows over
time, resulting in non-stationarities as new concepts are generated
and used.
We develop sparse multiclass moving average techniques designed to
respond to such non-stationarities in a timely manner. One technique
is based on the exponentiated moving average (EMA) and another is
based on queuing a few count snapshots. We show that the combination,
and in particular supporting dynamic predictand-specific learning
rates, offers advantages in terms of faster change detection and
convergence.
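
As a rough illustration of the EMA-based idea, the following Python sketch keeps a sparse map from items to estimated probabilities, decays every entry on each observation, boosts the observed item, and drops entries that are no longer salient. It is only a minimal sketch under stated assumptions, not the method developed in the paper: the class name `SparseEMA` and the parameters `rate` and `min_prob` are hypothetical, and it uses a single fixed learning rate rather than dynamic predictand-specific rates.

```python
from typing import Dict, Hashable


class SparseEMA:
    """Hypothetical sparse multiclass EMA predictor (illustrative only)."""

    def __init__(self, rate: float = 0.05, min_prob: float = 0.01) -> None:
        # Assumption: one fixed learning rate shared by all items.
        self.rate = rate
        # Assumption: entries below this threshold are not salient and are
        # dropped to bound space. It should stay below `rate`, or a newly
        # seen item would be pruned on the very next update.
        self.min_prob = min_prob
        self.probs: Dict[Hashable, float] = {}

    def predict(self) -> Dict[Hashable, float]:
        """Return the current candidate items with their probabilities."""
        return dict(self.probs)

    def update(self, observed: Hashable) -> None:
        """Decay all tracked items, prune weak ones, boost the observed item."""
        for item in list(self.probs):
            self.probs[item] *= 1.0 - self.rate
            if self.probs[item] < self.min_prob:
                del self.probs[item]
        self.probs[observed] = self.probs.get(observed, 0.0) + self.rate


if __name__ == "__main__":
    ema = SparseEMA()
    # A non-stationary stream: item "a" is frequent, then is replaced by "b".
    for item in ["a"] * 200 + ["b"] * 200:
        ema.update(item)
    print(ema.predict())
```

Because each update scales the existing mass by `1 - rate` and adds `rate` to one item, the tracked probabilities never sum to more than one, and an item that stops occurring fades below `min_prob` and is removed. The fixed rate also shows the trade-off that motivates per-item dynamic rates: a larger `rate` reacts to change faster but gives noisier estimates.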