Bridging the Simulation-to-Reality Gap: A Hybrid Data-Driven Framework for AI-based Prediction of Building Energy Retrofit Performance

15 Sept 2025 (modified: 08 Oct 2025) | Submitted to Agents4Science | CC BY 4.0
Keywords: simulation-to-reality, building energy retrofit, domain adaptation, physics-informed machine learning, conformal prediction, measurement and verification
TL;DR: To narrow the simulation-to-reality gap, this work proposes a hybrid, data-driven framework for AI-based prediction of building energy retrofit performance.
Abstract: Predicting realized retrofit performance remains difficult due to a persistent simulation-to-reality (Sim2Real) gap driven by construction and operational uncertainties, sensor biases, and occupant behavior. We propose a hybrid, data-driven framework that trains on large, standardized simulation corpora and calibrates on curated real-world monitoring datasets to quantify and reduce Sim2Real error. The approach augments tabular learners (e.g., XGBoost) with physics-informed features, applies domain-adaptive reweighting to correct distribution shift, and uses post-hoc conformal prediction for calibrated uncertainty. In-domain on iNSPiRe, the model attains $R^2=0.9075$ with $\mathrm{MAE}=\SI{0.027}{\kWh\per\square\metre\per\yr}$; cross-domain on real projects, a plain GBM collapses ($R^2=-2.44$), whereas our hybrid remains \emph{viable} ($R^2=0.10$) and reduces MAE by $\sim$54\% (127.95 $\rightarrow$ 58.25~\si{\kWh\per\month}). We contribute (i) a transparent Sim2Real evaluation protocol for retrofit prediction, (ii) a simple hybrid methodology that restores validity under shift, and (iii) reproducible assets (code, datasets, and experiment cards).
Supplementary Material: zip
Submission Number: 178
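
The abstract mentions post-hoc conformal prediction for calibrated uncertainty but does not spell out the procedure. As a rough illustration only, the sketch below shows standard split conformal prediction with absolute-residual scores; it is not taken from the authors' code. The function names, the toy predictor, and the calibration values are all hypothetical, and the choice of absolute residuals as the nonconformity score is an assumption.

```python
import math


def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of nonconformity scores."""
    n = len(scores)
    if n == 0:
        raise ValueError("need at least one calibration score")
    # Rank ceil((n + 1) * (1 - alpha)); the small offset guards against
    # floating-point round-up when the product is an exact integer.
    rank = math.ceil((n + 1) * (1 - alpha) - 1e-12)
    if rank > n:
        # Too few calibration points to support this coverage level.
        return math.inf
    return sorted(scores)[rank - 1]


def split_conformal_interval(predict, calib_x, calib_y, alpha=0.1):
    """Wrap a fitted point predictor with symmetric conformal intervals.

    `predict` is any already-trained model (hypothetical here); the
    calibration pairs must be held out from its training data.
    """
    scores = [abs(y - predict(x)) for x, y in zip(calib_x, calib_y)]
    q = conformal_quantile(scores, alpha)

    def interval(x):
        p = predict(x)
        return (p - q, p + q)

    return interval


# Toy usage with a made-up predictor and made-up calibration data.
toy_predict = lambda x: 2 * x
toy_interval = split_conformal_interval(
    toy_predict, [1, 2, 3, 4], [2.5, 3.0, 6.2, 8.0], alpha=0.2
)
low, high = toy_interval(5)
```

Under the usual exchangeability assumption, intervals built this way cover the true value with probability at least 1 - alpha. That assumption is exactly what a Sim2Real shift can violate, which is presumably why the abstract calibrates on real-world monitoring data rather than on simulated data.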