Beyond Classification: A Cough Regression Benchmark for Respiratory Acoustic Foundation Models

Published: 23 May 2026, Last Modified: 23 May 2026, Venue: SD4H ICML 2026, License: CC BY 4.0
Keywords: Respiratory acoustics, Foundation models, Cough regression, Benchmark, Cross-dataset generalisation
Abstract: Respiratory acoustic foundation models (FMs) excel at cough classification, yet their ability to predict continuous health quantities from cough audio remains largely unexplored, despite the clinical value of passively estimating age, BMI, and disease probability in settings where physical measurements are unavailable. We introduce a multi-model, multi-target cough regression benchmark that evaluates five FMs (OPERA-CT, OPERA-CE, OPERA-GT, HEAR, M2D+RESP) across six targets on three datasets under subject-disjoint protocols, comparing linear, MLP-small, and full MLP regression heads. MLP-small beats the mean-predictor baseline on all tasks and beats linear probing in 23 of 30 model × task cases. The full MLP overfits on small clinical data but recovers on larger sets, revealing a trade-off between dataset size and head capacity. HEAR leads within-dataset age regression (MAE of 10.29 yr on CIDRZ and 9.12 yr on Coswara). OPERA-GT consistently outperforms OPERA-CT on age regression across all three datasets, extending a generative pretraining advantage from breath to cough. HEAR and M2D+RESP reach near-full performance at N = 50 samples, while OPERA models require N = 400. Cross-dataset transfer is strongly asymmetric: large, diverse data generalises to small clinical populations (CoughVID → CIDRZ: −0.17 yr), but not vice versa (CIDRZ → Coswara: +2.43 yr, +26.6%).
Submission Number: 107
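
The evaluation pattern the abstract describes (a subject-disjoint split, a mean-predictor baseline, and a linear probe scored by MAE) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the benchmark's actual code: the embeddings and ages are synthetic stand-ins for frozen FM features, the function names (`subject_disjoint_split`, `fit_linear_probe`, `predict_linear_probe`, `mae`) are hypothetical, and closed-form ridge regression is swapped in as a simple linear head.

```python
import numpy as np

def subject_disjoint_split(subject_ids, test_fraction=0.2, seed=0):
    """Split sample indices so that no subject appears in both train and test."""
    rng = np.random.default_rng(seed)
    subjects = np.unique(subject_ids)
    rng.shuffle(subjects)
    n_test = max(1, int(round(test_fraction * len(subjects))))
    test_mask = np.isin(subject_ids, subjects[:n_test])
    return np.flatnonzero(~test_mask), np.flatnonzero(test_mask)

def mae(y_true, y_pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))

def fit_linear_probe(X, y, l2=1.0):
    """Closed-form ridge regression with an unpenalised bias (assumed linear head)."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    reg = l2 * np.eye(Xb.shape[1])
    reg[-1, -1] = 0.0  # leave the bias term unregularised
    return np.linalg.solve(Xb.T @ Xb + reg, Xb.T @ y)

def predict_linear_probe(w, X):
    return np.hstack([X, np.ones((X.shape[0], 1))]) @ w

# Synthetic stand-in data: 60 subjects, 3 recordings each, 16-dim "embeddings".
rng = np.random.default_rng(0)
n_subjects, per_subject, dim = 60, 3, 16
subject_ids = np.repeat(np.arange(n_subjects), per_subject)
X = rng.normal(size=(n_subjects * per_subject, dim))
true_w = rng.normal(size=dim)
age = 40 + 5 * (X @ true_w) / np.sqrt(dim) + rng.normal(scale=2.0, size=len(X))

train_idx, test_idx = subject_disjoint_split(subject_ids)

# Mean-predictor baseline: always predict the training-set mean.
baseline_pred = np.full(len(test_idx), age[train_idx].mean())
baseline_mae = mae(age[test_idx], baseline_pred)

# Linear probe on the (synthetic) frozen embeddings.
w = fit_linear_probe(X[train_idx], age[train_idx])
probe_mae = mae(age[test_idx], predict_linear_probe(w, X[test_idx]))

print(f"mean-predictor MAE: {baseline_mae:.2f} yr, linear-probe MAE: {probe_mae:.2f} yr")
```

Splitting by subject rather than by recording keeps multiple coughs from one person from leaking across the train/test boundary, which would otherwise make a head look better than it is.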