Towards Measuring Predictability: To which extent data-driven approaches can extract deterministic relations from data exemplified with time series prediction and classification

Published: 05 Feb 2025, Last Modified: 05 Feb 2025. Accepted by TMLR. License: CC BY 4.0
Abstract: Minimizing a loss function is a key ingredient of machine learning: parameters are fitted so that the model extracts relations hidden in the data. The smaller the loss value on various splits of a dataset, the better the model is assumed to perform. However, datasets are usually generated by dynamics that combine deterministic components, whose relations are clearly defined and consequently learnable, with stochastic components, whose outcomes are random and thus not predictable. Depending on the amplitudes of the deterministic and stochastic processes, the best achievable loss value varies, and it is usually not known in real data science scenarios. In this research, a statistical framework is developed that provides measures of how predictable a target is given the available input data and, after a machine learning model has been trained, of how much of the deterministic relations the model has missed. Consequently, the presented framework allows model errors to be separated into a part that is unpredictable given the input and a systematic miss of deterministic relations. The work extends the definition of model success or failure as well as that of the convergence of a training process. Moreover, it is demonstrated how such measures can enrich the procedure of model training. The framework is showcased with time series data on different synthetic and real-world datasets. The code is available at https://github.com/Saleh-Gholam-Zadeh/predictability_measure.
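The idea that the best achievable loss depends on the amplitude of the stochastic part can be illustrated with a small synthetic example. The sketch below is not the framework from the paper or its repository. It assumes a toy target y = sin(x) + Gaussian noise, and the function `residual_dependence` is a hypothetical, crude proxy (variance of the residual explained by a polynomial in the input) standing in for a proper statistical dependence measure. All names and parameter values are assumptions chosen for illustration.

```python
import numpy as np


def simulate(n=20000, noise_std=0.5, seed=0):
    """Toy data: deterministic part sin(x) plus unpredictable Gaussian noise."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-np.pi, np.pi, size=n)
    signal = np.sin(x)                          # learnable relation
    noise = rng.normal(0.0, noise_std, size=n)  # not predictable from x
    return x, signal + noise


def mse(y, y_hat):
    """Mean squared error between target and prediction."""
    return float(np.mean((y - y_hat) ** 2))


def residual_dependence(x, residual, degree=5):
    """Hypothetical proxy: share of residual variance that a polynomial
    in x can still explain. Near 0 means little structure is left."""
    coeffs = np.polyfit(x, residual, degree)
    fitted = np.polyval(coeffs, x)
    return float(1.0 - np.var(residual - fitted) / np.var(residual))


x, y = simulate()

# Oracle model: knows the deterministic relation exactly.
oracle_pred = np.sin(x)
oracle_loss = mse(y, oracle_pred)              # close to noise_std**2
dep_oracle = residual_dependence(x, y - oracle_pred)

# Naive model: predicts the mean and misses the deterministic relation.
mean_pred = np.full_like(y, y.mean())
mean_loss = mse(y, mean_pred)
dep_mean = residual_dependence(x, y - mean_pred)

print("oracle:", oracle_loss, dep_oracle)
print("mean  :", mean_loss, dep_mean)
```

In this toy setting the oracle's loss cannot drop below the noise variance, and its residuals carry almost no information about the input, whereas the mean predictor's residuals remain strongly related to the input. That mirrors the distinction the abstract draws between unpredictable error and a systematic miss of deterministic relations.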
Submission Length: Long submission (more than 12 pages of main content)
Changes Since Last Submission: Outlined in the answer to the AE decision.
Code: https://github.com/Saleh-Gholam-Zadeh/predictability_measure
Supplementary Material: zip
Assigned Action Editor: ~Fredrik_Daniel_Johansson1
Submission Number: 2739