Abstract: Machine learning (ML) research has investigated prototypes: examples that are representative of the behavior to be learned. We systematically evaluate five methods for identifying prototypes, both ones previously introduced as well as new ones we propose, finding all of them to provide meaningful but different interpretations. Through a human study, we confirm that all five metrics are well matched to human intuition. Examining cases where the metrics disagree offers an informative perspective on the properties of data and algorithms used in learning, with implications for data-corpus construction, efficiency, adversarial robustness, interpretability, and other ML aspects. In particular, we confirm that the "train on hard" curriculum approach can improve accuracy on many datasets and tasks, but that it is strictly worse when there are many mislabeled or ambiguous examples.
Keywords: prototypes, curriculum learning, interpretability, differential privacy, adversarial robustness
TL;DR: We can identify prototypical and outlier examples in machine learning that are quantifiably very different, and make use of them to improve many aspects of neural networks.
Data: [CIFAR-10](https://paperswithcode.com/dataset/cifar-10), [Fashion-MNIST](https://paperswithcode.com/dataset/fashion-mnist)
18 Replies
Loading