OmniNet: A multi-modality neural network for robust remote respiratory rate measurement from facial video
Keywords: Remote Respiratory Rate Measurement, Multimodal Learning
Abstract: Remote respiratory rate (RR) measurement has gained traction due to its potential to reduce healthcare professionals’ workload and patient discomfort. Prior work has largely addressed this problem through remote photoplethysmography (rPPG), which captures subtle facial color changes; however, this technique is sensitive to lighting and motion variations. To this end, we propose OmniNet, a multimodal neural network that integrates image data processed through 3D convolutional neural networks (3D CNNs) with point of interest (POI) motion data and passes the fused features to a Bidirectional Long Short-Term Memory (BiLSTM) network to model long-term temporal dependencies. OmniNet achieves state-of-the-art performance by capturing comprehensive spatial and temporal information while mitigating the effects of illumination variation and motion-induced artifacts. It also requires fewer computational resources and offers faster inference than Transformer-based networks. The code has been released on GitHub: https://anonymous.4open.science/r/spiro-7AFD.
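The abstract outlines a two-branch fusion architecture (3D CNN video branch plus POI motion branch, fused and fed to a BiLSTM). The following is a minimal, hypothetical sketch of that design for orientation only; layer sizes, the number of POI landmarks, the concatenation-based fusion, and the regression head are illustrative assumptions, and the released repository linked above remains the authoritative implementation.

```python
# Hypothetical sketch of the fusion architecture described in the abstract:
# a 3D CNN video branch and a POI motion branch, fused per frame and fed to
# a BiLSTM that regresses respiratory rate. All dimensions are assumptions.
import torch
import torch.nn as nn


class OmniNetSketch(nn.Module):
    def __init__(self, n_poi: int = 16, feat_dim: int = 64, lstm_hidden: int = 128):
        super().__init__()
        # Video branch: shallow 3D CNN over (3, T, H, W) facial clips.
        # Spatial dims are pooled away; the temporal dim is preserved so the
        # BiLSTM can model long-term dependencies across frames.
        self.video_branch = nn.Sequential(
            nn.Conv3d(3, 32, kernel_size=(3, 5, 5), padding=(1, 2, 2)),
            nn.BatchNorm3d(32),
            nn.ReLU(inplace=True),
            nn.MaxPool3d(kernel_size=(1, 2, 2)),
            nn.Conv3d(32, feat_dim, kernel_size=3, padding=1),
            nn.BatchNorm3d(feat_dim),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool3d((None, 1, 1)),  # keep T, pool H and W
        )
        # Motion branch: per-frame embedding of POI (dx, dy) displacements.
        self.motion_branch = nn.Sequential(
            nn.Linear(2 * n_poi, feat_dim),
            nn.ReLU(inplace=True),
            nn.Linear(feat_dim, feat_dim),
        )
        # Fused per-frame features -> BiLSTM -> scalar RR prediction.
        self.bilstm = nn.LSTM(2 * feat_dim, lstm_hidden,
                              batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * lstm_hidden, 1)

    def forward(self, video: torch.Tensor, poi: torch.Tensor) -> torch.Tensor:
        # video: (B, 3, T, H, W); poi: (B, T, 2 * n_poi)
        v = self.video_branch(video).squeeze(-1).squeeze(-1)  # (B, feat_dim, T)
        v = v.transpose(1, 2)                                 # (B, T, feat_dim)
        m = self.motion_branch(poi)                           # (B, T, feat_dim)
        fused = torch.cat([v, m], dim=-1)                     # (B, T, 2*feat_dim)
        out, _ = self.bilstm(fused)
        return self.head(out[:, -1])                          # (B, 1) RR estimate


if __name__ == "__main__":
    model = OmniNetSketch()
    clip = torch.randn(2, 3, 90, 64, 64)  # 2 clips, 90 frames, 64x64 face crops
    poi = torch.randn(2, 90, 32)          # 16 assumed POIs x (dx, dy) per frame
    print(model(clip, poi).shape)         # torch.Size([2, 1])
```

Compared with a Transformer over the same frame sequence, the recurrent BiLSTM avoids quadratic attention over time steps, which is consistent with the abstract's claim of lower computational cost and faster inference.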
Primary Subject Area: Integration of Imaging and Clinical Data
Secondary Subject Area: Learning with Noisy Labels and Limited Data
Registration Requirement: Yes
Reproducibility: https://anonymous.4open.science/r/spiro-7AFD
Visa & Travel: Yes
Read CFP & Author Instructions: Yes
Originality Policy: Yes
Single-blind & Not Under Review Elsewhere: Yes
LLM Policy: Yes
Submission Number: 46