Abstract: We introduce and study minimax curriculum learning (MCL), a new method for adaptively selecting a sequence of training subsets for a succession of stages in machine learning. The subsets are encouraged to be small and diverse early on, and then larger, harder, and allowably more homogeneous in later stages. At each stage, model weights and training sets are chosen by solving a joint continuous-discrete minimax optimization, whose objective is composed of a continuous loss (reflecting training set hardness) and a discrete submodular promoter of diversity for the chosen subset. MCL repeatedly solves a sequence of such optimizations with a schedule of increasing training set size and decreasing pressure on diversity encouragement. We reduce MCL to the minimization of a surrogate function handled by submodular maximization and continuous gradient methods. We show that MCL achieves better performance and, with a clustering trick, uses fewer labeled samples for both shallow and deep models while achieving the same performance. Our method involves repeatedly solving constrained submodular maximization of an only slowly varying function on the same ground set. Therefore, we develop a heuristic method that utilizes the previous submodular maximization solution as a warm start for the current submodular maximization process to reduce computation while still yielding a guarantee.
TL;DR: Minimax Curriculum Learning is a machine teaching method involving increasing desirable hardness and scheduled reducing diversity.
Keywords: machine teaching, deep learning, minimax, curriculum learning, submodular, diversity
Data: [CIFAR-10](https://paperswithcode.com/dataset/cifar-10), [Fashion-MNIST](https://paperswithcode.com/dataset/fashion-mnist), [MNIST](https://paperswithcode.com/dataset/mnist), [STL-10](https://paperswithcode.com/dataset/stl-10), [SVHN](https://paperswithcode.com/dataset/svhn)