CLEVA-Compass: A Continual Learning Evaluation Assessment Compass to Promote Research Transparency and Comparability
Keywords: continual learning, lifelong learning, machine learning evaluation
Abstract: What is the state of the art in continual machine learning? Although a natural question for predominant static benchmarks, the notion of training systems in a lifelong manner entails a plethora of additional challenges with respect to set-up and evaluation. These challenges have recently sparked a growing number of critiques that prominent algorithm-centric perspectives and evaluation protocols are too narrow, resulting in several attempts at constructing guidelines in favor of specific desiderata or arguing against the validity of prevalent assumptions. In this work, we depart from this mindset and argue that the goal of a precise formulation of desiderata is an ill-posed one, as diverse applications may always warrant distinct scenarios. Instead, we introduce the Continual Learning EValuation Assessment Compass: the CLEVA-Compass. The compass provides the visual means to both identify how approaches are practically reported and how works can simultaneously be contextualized in the broader literature landscape. In addition to promoting compact specification in the spirit of recent replication trends, it thus provides an intuitive chart to understand the priorities of individual systems, where they resemble each other, and what elements are missing towards a fair comparison.
One-sentence Summary: We introduce the Continual Learning EValuation Assessment Compass, which provides the visual means to both identify how approaches are practically reported and how they can simultaneously be contextualized in the broader literature landscape.
Supplementary Material: zip