Hard Sample Mining: A New Paradigm of Efficient and Robust Model Training

Lei Liu, Yunji Liang, Xiaokai Yan, Luwen Huangfu, Sagar Samtani, Zhiwen Yu, Yanyong Zhang, Daniel D. Zeng

Published: 01 Jan 2025, Last Modified: 07 Jan 2026. IEEE Transactions on Neural Networks and Learning Systems. License: CC BY-SA 4.0
Abstract: Over the past two decades, deep learning (DL) has achieved unprecedented breakthroughs across diverse application domains, from computer vision (CV) to natural language processing (NLP). However, despite significant advances in computational resources and algorithmic frameworks, training deep neural networks remains challenging due to persistent training inefficiency and inherent data distribution biases. Recent years have witnessed the emergence of hard sample mining (HSM) as a promising paradigm for mitigating training inefficiencies and enhancing model robustness through representative sample selection. Although HSM is reshaping contemporary AI research, its critical role in enabling efficient and robust model training has not yet been systematically explored. This article presents a comprehensive survey of HSM methodologies by: 1) establishing unified definitions of hard samples through rigorous sample complexity quantification criteria; 2) proposing a systematic taxonomy of HSM approaches with in-depth technical analysis; and 3) identifying pivotal research frontiers in this evolving field. This survey not only consolidates the foundations of HSM but also provides a roadmap for advancing efficient, robust, and generalizable deep learning models.
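To make the core idea concrete, below is a minimal PyTorch sketch of one common HSM strategy: loss-based online hard example mining, where only the highest-loss samples in each batch contribute to the gradient update. This is an illustrative example of the general paradigm, not the paper's specific method; the function name hard_sample_step and the keep_ratio parameter are hypothetical, and a classification setting with cross-entropy loss is assumed.

```python
import torch
import torch.nn as nn

def hard_sample_step(model, optimizer, inputs, targets, keep_ratio=0.25):
    """One training step that backpropagates only through the hardest
    samples in the batch, ranked by per-sample loss (OHEM-style sketch)."""
    criterion = nn.CrossEntropyLoss(reduction="none")  # keep per-sample losses
    logits = model(inputs)
    losses = criterion(logits, targets)  # shape: (batch_size,)

    # Select the top-k highest-loss ("hardest") samples in the batch.
    k = max(1, int(keep_ratio * losses.numel()))
    hard_losses, hard_idx = torch.topk(losses, k)

    optimizer.zero_grad()
    hard_losses.mean().backward()  # gradients come from hard samples only
    optimizer.step()
    return hard_idx  # indices of the mined hard samples, for inspection
```

Ranking by per-batch loss is only one of the sample complexity criteria a survey like this covers; alternatives include gradient-norm, margin, and uncertainty-based measures, which would replace the topk ranking signal in the sketch above.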