Keywords: algorithm comparison, model comparison, data-centric, influence, datamodels, pre-training, spurious correlation, distribution shift
TL;DR: A unified framework for comparing models trained with two different learning algorithms based on how models use training data to make predictions
Abstract: Understanding model biases is crucial to understanding how models will perform out-of-distribution (OOD). These biases often stem from particular design choices (e.g., architecture or data augmentation). We propose a framework for (learning) algorithm comparisons, wherein the goal is to find similarities and differences between models trained with two different learning algorithms. We begin by formalizing the goal of algorithm comparison as finding distinguishing feature transformations, input transformations that change the predictions of models trained with one learning algorithm but not the other. We then present a two-stage method for algorithm comparisons based on comparing how models use the training data, leveraging the recently proposed datamodel representations [IPE+22]. We demonstrate our framework through a case study comparing classifiers trained on the Waterbirds [SKH+20] dataset with/without ImageNet pre-training.