Abstract: Deep learning models benefit from training on a vast collection of samples. However, training on more samples does not necessarily translate into higher accuracy or faster training. Learning from organized training material, e.g., by emphasizing easy or hard examples, has been shown to achieve better performance in certain scenarios. While there is no standard strategy for evaluating easiness, the loss is frequently used as an indicator. In this work, we propose a unified sample easiness estimator that quantifies how easy a sample is for a model. We further propose a novel loss function, named Sample Easiness-based Loss (SEL), which regularizes the class probabilities so that they can be better used for sample easiness estimation. SEL can be easily applied to any neural network architecture without modification. We then provide a novel neural network training strategy, Sample Easiness-based Training (SET), which allows training on samples of a designated easiness level, e.g., medium easiness, to reduce training time significantly. Results show that SET uses only 0.06%--11.18% of the training samples while achieving similar or higher test accuracy. In addition, we demonstrate that our sample easiness framework is helpful for identifying mislabeled data.
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Gang_Niu1
Submission Number: 338