Abstract: There is substantial demand for deep learning methods that can work with limited, high-dimensional, and noisy datasets. Nonetheless, current research largely neglects this setting, especially in the absence of prior expert knowledge or knowledge transfer. In this work, we bridge this gap by studying the performance of deep learning methods on the true data distribution in a limited, high-dimensional, and noisy data setting. To this end, we conduct a systematic evaluation that reduces the available training data while retaining the challenging properties mentioned above. Furthermore, we extensively search the space of hyperparameters and compare state-of-the-art architectures against models built and trained from scratch, advocating for the use of multi-objective tuning strategies. Our experiments highlight the lack of performant deep learning models in the current literature and investigate the impact of training hyperparameters. We analyze model complexity and demonstrate the advantage of choosing models tuned under multi-objective criteria in lower data regimes, which reduces the likelihood of overfitting. Lastly, we demonstrate the importance of selecting a proper inductive bias given a limited-sized dataset. Based on our results, we conclude that tuning models using a multi-objective criterion yields simpler yet competitive models as the number of data points is reduced.