Keywords: semi-supervised regression; mixture of experts; regime discovery; clustering-gated routing; tabular learning; uncertainty estimation; pseudo-label risk control; consistency regularization; conditional shift; industrial analytics
Abstract: We study regime-aware semi-supervised regression for tunnel boring machine (TBM) operation modeling under cross-strata nonstationarity and label scarcity. We propose \textbf{CGE}---\emph{Clustering-Gated Experts}---a three-stage framework that (i) discovers latent geological regimes via robust ensemble clustering in a compact descriptor space; (ii) trains per-regime heterogeneous ensembles with agreement-based pseudo-labeling and consistency regularization; and (iii) routes predictions through a lightweight distance-based soft gate. For risk-aware deployment, we equip all predictors with conformalized quantile regression (CQR) to produce calibrated prediction intervals. On real TBM data with 5--20\% label budgets, \textbf{CGE} surpasses strong semi-supervised baselines; at 10\% labels it reaches an average coefficient of determination (R\textsuperscript{2}) of 0.94 and root-mean-squared error (RMSE) of 0.11. With 90\% CQR prediction intervals, it attains near-nominal coverage together with narrow interval widths and lower negative log-likelihood and continuous ranked probability score (CRPS). Overall, \textbf{CGE} offers a practical accuracy--uncertainty trade-off for safety-critical TBM decision-making under nonstationary geology.
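As an illustration of the CQR step mentioned in the abstract, the following is a minimal sketch of the standard split-conformal calibration for quantile regressors. It is not the authors' implementation. The function names `cqr_offset` and `cqr_interval` are hypothetical, and the sketch assumes lower and upper quantile predictions are already available for a held-out calibration set.

```python
import math

def cqr_offset(y_cal, lo_cal, hi_cal, alpha=0.1):
    """Conformal offset from a held-out calibration set (hypothetical helper).

    y_cal  : true targets on the calibration set
    lo_cal : predicted lower quantiles for the same points
    hi_cal : predicted upper quantiles for the same points
    alpha  : miscoverage level (0.1 targets 90% intervals)
    """
    # Conformity score: how far the target falls outside the predicted
    # interval (negative when the target lies strictly inside it).
    scores = sorted(max(lo - y, y - hi)
                    for y, lo, hi in zip(y_cal, lo_cal, hi_cal))
    n = len(scores)
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points to certify this coverage level.
        return math.inf
    return scores[k - 1]

def cqr_interval(lo, hi, offset):
    """Widen (or shrink, if offset < 0) a predicted interval by the offset."""
    return lo - offset, hi + offset

# Toy calibration set: every predicted interval is [0, 1].
offset = cqr_offset([0.5, 1.5, 2.0, -1.0], [0.0] * 4, [1.0] * 4, alpha=0.5)
print(cqr_interval(0.0, 1.0, offset))
```

Under exchangeability of calibration and test points, intervals adjusted this way cover the target with probability at least 1 - alpha. A regime-aware variant could compute one offset per discovered regime, though whether CGE does so is not stated here.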
Supplementary Material: zip
Primary Area: applications to physical sciences (physics, chemistry, biology, etc.)
Submission Number: 15576