A model-agnostic ordinal regression pipeline for length of stay prediction

Xiaoxiao Huang, Kaibo He, Chenyu Hou, Min Zhou, Dingchang Zheng

Published: 2025, Last Modified: 06 Jan 2026, J. Supercomput. 2025, CC BY-SA 4.0

Abstract: Predicting hospitalization duration, known as length of stay (LoS), is critical for optimizing healthcare resource allocation. Several earlier studies divide LoS into discrete buckets and predict the bucket with classification methods. However, these studies overlook both the skewed distribution of the buckets and their intrinsic ordinal nature. In addition, highly sparse electronic health records (EHRs) degrade prediction accuracy. To address these challenges, we propose a model-agnostic ordinal regression pipeline for length of stay prediction (MORE) in ICUs. First, we introduce a variable selection module that prunes marginal and sparse features from the original input data. This directs the model’s focus toward important features, reducing the influence of noise and improving computational efficiency. Second, we present a multi-task learning-based optimization module that integrates a cross-entropy loss and an accumulated link loss into a unified loss function. Finally, we conduct extensive experiments on two publicly available datasets, MIMIC-III and PhysioNet. The experimental results show that MORE can improve the performance of existing classification methods in terms of mean absolute error and accuracy.
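The abstract does not give the exact form of the unified loss, so the following is only a minimal sketch of how a cross-entropy term and a cumulative-style ordinal term could be combined for bucketed LoS labels. Everything in it is an assumption made for illustration: the function names, the weighting parameter `alpha`, and the choice to build the ordinal term as a binary cross-entropy on the cumulative probabilities P(y <= k). It should not be read as the authors' implementation.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the bucket axis.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(probs, y, eps=1e-12):
    # Standard multi-class cross-entropy; ignores the ordering of buckets.
    return -np.mean(np.log(probs[np.arange(len(y)), y] + eps))

def cumulative_link_loss(probs, y, eps=1e-12):
    # Hypothetical ordinal term (an assumption, not the paper's definition):
    # binary cross-entropy between P(y <= k) and the indicator 1[y <= k]
    # for each of the K-1 thresholds between adjacent buckets.
    num_buckets = probs.shape[1]
    cum = np.cumsum(probs, axis=1)[:, :-1]
    cum = np.clip(cum, eps, 1.0 - eps)
    targets = (y[:, None] <= np.arange(num_buckets - 1)[None, :]).astype(float)
    bce = -(targets * np.log(cum) + (1.0 - targets) * np.log(1.0 - cum))
    return bce.mean()

def combined_loss(logits, y, alpha=0.5):
    # Unified objective: classification term plus a weighted ordinal term.
    # alpha is an illustrative weight, not a value from the paper.
    probs = softmax(logits)
    return cross_entropy(probs, y) + alpha * cumulative_link_loss(probs, y)

# Two predictions for a stay whose true bucket is 0. Both give the true
# bucket probability 0.1, but one puts its mass on the adjacent bucket
# and the other on the farthest bucket.
y_true = np.array([0])
near_miss = np.log(np.array([[0.1, 0.7, 0.1, 0.1]]))
far_miss = np.log(np.array([[0.1, 0.1, 0.1, 0.7]]))

print("near miss:", combined_loss(near_miss, y_true))
print("far miss: ", combined_loss(far_miss, y_true))
```

In this sketch the cross-entropy term is identical for the two predictions, because both assign the same probability to the true bucket. Only the ordinal term separates them, penalizing the prediction that lands farther from the true bucket more heavily, which is the kind of ordering information that plain classification discards.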

External IDs: dblp:journals/tjs/HuangHHZZ25