A Log-Linear Analytics Approach to Cost Model Regularization for Inpatient Stays through Diagnostic Code Merging
Abstract: Healthcare cost models that use large numbers of fine-grained ICD-10 diagnostic codes produce unstable coefficient estimates, yet the causes of this instability have not been well understood. This study provides a mathematical framework linking the variability of model coefficients to the uneven, power-law distribution of diagnostic codes and to the structure of the regression model. We propose a transparent approach that improves coefficient stability by merging similar codes through hierarchical truncation. Using Medicare data, we demonstrate how this method clarifies the trade-off between code detail and model reliability, offering analysts and policymakers a practical and interpretable tool for diagnosis-based cost modeling.
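As a rough illustration of what merging codes by hierarchical truncation could look like, the sketch below shortens each ICD-10 code to its leading characters and counts how many stays fall under each merged code. It assumes that truncation means keeping a fixed-length prefix of the code with the decimal point removed; the abstract does not specify the exact procedure, and the function names `truncate_code` and `merge_counts` as well as the toy data are hypothetical.

```python
from collections import Counter


def truncate_code(code: str, depth: int) -> str:
    """Keep the first `depth` characters of an ICD-10 code, ignoring the dot.

    Assumption: truncation is a plain prefix cut; the study may use a
    different rule.
    """
    return code.replace(".", "")[:depth]


def merge_counts(codes, depth):
    """Count stays per merged code after truncating every code to `depth`."""
    return Counter(truncate_code(c, depth) for c in codes)


# Hypothetical toy data, not drawn from the Medicare sample.
codes = ["E11.9", "E11.65", "E11.9", "I50.22", "I50.9", "J18.9"]

full = merge_counts(codes, depth=7)    # full detail: every distinct code kept
merged = merge_counts(codes, depth=3)  # category level: first three characters

print(len(full), "distinct codes at full detail")
print(len(merged), "distinct codes after truncation:", dict(merged))
```

Shorter prefixes pool rare codes into larger groups, so each regression coefficient is estimated from more stays, at the cost of clinical detail. That is the trade-off between code detail and model reliability described in the abstract.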